Search Results (4,670)

Search Parameters:
Keywords = classification tree

37 pages, 1707 KB  
Article
A Consolidated Framework for the Detection of Alzheimer’s Disease Using EEG Signals and Hybrid Models
by Sunil Kumar Prabhakar and Dong-Ok Won
Biomimetics 2026, 11(5), 348; https://doi.org/10.3390/biomimetics11050348 - 15 May 2026
Abstract
Alzheimer’s disease (AD) is a serious neurodegenerative disorder that can severely affect behavior and thinking patterns, and is accompanied by frequent memory loss. The early diagnosis of AD is essential, as this can benefit the patient, but detecting AD is a complex process due to the nature of its associated clinical data. Electroencephalography (EEG) serves as a promising and cost-effective technique for analyzing AD-related brain activity patterns. In this work, a consolidated framework for detecting AD using EEG signals and hybrid models is proposed that uses a dataset that is available online. For the feature extraction module, five efficient techniques—Principal Component Analysis (PCA), Kernel Partial Least Squares (KPLS), Kriging Model, Isomap, and K-means clustering—are used. For feature selection, with the help of biomimetics-based concepts, three efficient algorithms are used: hybrid Cuckoo Search Optimization–Rat Swarm Optimization (CSO-RSO), Zebra Optimization (ZOA), and hybrid Gravitational Search Algorithm–Particle Swarm Optimization (GSA-PSO). Four interesting hybrid classifiers are utilized here to detect AD using EEG signals—hybrid Extreme Learning Machine–Adaboost (ELM–Adaboost), hybrid Classification and Regression Trees–Adaboost (CART–Adaboost), and hybrid weighted broad learning system-based Adaboost (HWBLSA), followed by a hybrid machine learning classification model with a soft voting technique—and, finally, these are compared with other standard machine learning classifiers. The highest classification accuracy of 98.71% is found when the Kriging Model feature extraction concept is combined with the hybrid GSA-PSO feature selection method and classified with the ELM–Adaboost classifier. Full article
(This article belongs to the Section Biological Optimisation and Management)
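As a rough illustration of the soft-voting idea in this abstract, the sketch below (not the authors' code) combines an AdaBoost-over-CART learner with a second probabilistic classifier in scikit-learn; synthetic features stand in for the EEG feature-extraction and metaheuristic feature-selection stages, which are omitted.

```python
# Minimal sketch (not the authors' pipeline): a CART-AdaBoost base learner
# combined with another classifier in a soft-voting ensemble, as described in
# the abstract. Synthetic features stand in for real EEG-derived features.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, VotingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=400, n_features=30, random_state=0)

# AdaBoost over CART trees (scikit-learn >= 1.2 uses the `estimator` keyword).
cart_ada = AdaBoostClassifier(estimator=DecisionTreeClassifier(max_depth=3),
                              n_estimators=100, random_state=0)
voter = VotingClassifier(
    estimators=[("cart_ada", cart_ada),
                ("logreg", LogisticRegression(max_iter=1000))],
    voting="soft")            # average predicted class probabilities

print("CV accuracy:", cross_val_score(voter, X, y, cv=5).mean())
```

Soft voting averages predicted class probabilities, so every ensemble member must implement predict_proba.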
20 pages, 1998 KB  
Systematic Review
Machine Learning and Deep Learning for Wildfire Prediction: A Systematic and Bibliometric Review of Methods, Data Practices, and Reproducibility (2020–2025)
by Kevin Manuel Galván Lara, Yosune Miquelajauregui, Luis Fernando Enriquez Ocaña, Alf Enrique Meling-López, Christoph Neger, John Abatzoglou, Leopoldo Galicia, César Hinojo, Graciela Jiménez-Guzmán and Edelmira Rodríguez Alcantar
Fire 2026, 9(5), 204; https://doi.org/10.3390/fire9050204 - 15 May 2026
Abstract
Wildfire prediction using machine learning (ML) and deep learning (DL) has expanded rapidly, yet synthesis regarding algorithmic configurations, data practices, and transparency remains limited. This systematic review characterizes ML/DL applications in wildfire prediction (2020–2025) using a PRISMA-EcoEvo framework across 341 peer-reviewed studies, with detailed analysis of 110 articles from 2024. Publication output increased steadily, concentrated geographically in China and the United States. Methodologically, ensemble tree-based methods (26.7%) and deep learning architectures (59.4%) coexist, reflecting adaptation to diverse data modalities. Input data are dominated by vegetation/fuel characteristics (44.7%) and historical fire labels (41.2%), while socioeconomic variables remain marginal (1.2%). Evaluation practices distinguish classification and regression tasks, yet metric heterogeneity constrains cross-study comparability. Critically, only 7.7% of studies provided publicly accessible code, with a significant association between algorithm family and code availability (χ2 = 78, p = 0.0012). Collectively, wildfire ML/DL research demonstrates technical advancement but remains geographically concentrated and constrained by limited transparency. Strengthening reporting standards, metric-task alignment, dataset documentation, and open-code practices is essential to translate computational innovation into globally robust, reproducible wildfire decision-support systems. Full article
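The association reported above between algorithm family and code availability is a chi-square test of independence; the sketch below shows the generic computation with SciPy on an invented contingency table (the counts are placeholders, not the review's data).

```python
# Sketch of a chi-square test of independence between algorithm family and
# code availability; the counts below are invented for illustration and are
# NOT the contingency table from the review.
import numpy as np
from scipy.stats import chi2_contingency

# rows: algorithm family, columns: (code public, code not public)
table = np.array([[10, 80],    # ensemble tree-based
                  [ 5, 120],   # deep learning
                  [ 2, 40]])   # other
chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2={chi2:.1f}, dof={dof}, p={p:.4f}")
```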
26 pages, 3369 KB  
Article
Performance of Global Land Use Land Cover Products for Southwest China Karst
by Chunhua Zhang, Xiangkun Qi, Hoi Shan Cheung, Mingyang Zhang, Yuemin Yue and Kelin Wang
Remote Sens. 2026, 18(10), 1573; https://doi.org/10.3390/rs18101573 - 14 May 2026
Abstract
Accurate land use and land cover (LULC) data are essential for effective environmental management and reliable ecological modeling within complex landscapes such as the karst region of Southwest China. While new 10 m resolution global LULC products (i.e., ESA WorldCover, ESRI Land Cover, and annual mode composite of Dynamic World (DW)) offer unprecedented spatial detail, their reliability in heterogeneous karst remains poorly understood. We evaluated the accuracy and spatial consistency of these products for 2021 in the karst regions across five provinces in Southwest China using 1416 reference points collected through stratified random sampling. The ESA WorldCover dataset outperformed the others, achieving the highest overall accuracy (79.39 ± 2.19%). ESRI’s shrub metrics, however, reflect the structural absence of this class from its 2021 product rather than classification error. ESA’s superior performance in preserving fine-scale features is consistent with independent global assessments of both the 2020 and 2021 versions. This superior performance is attributed to its integration of Sentinel-1 SAR with optical data, a finer minimum mapping unit (100 m2), and expert-driven post-classification corrections. While all products successfully identified dominant classes like trees, substantial confusion emerged among spectrally similar classes such as shrubs, grass, and crops. A key finding was the strong effect of landscape heterogeneity on accuracy. Classification accuracy was 19.37% lower at patch edges (67.38%) compared to patch interiors (86.75%). Furthermore, edge reference points contribute disproportionately to total errors. Critically, none of the three products currently provide a sufficient basis for shrub-focused ecological monitoring in this region: ESA rarely detected shrub cover, DW mapped extensive but largely inaccurate shrub areas, and ESRI eliminated the shrub class from its 2021 product. These results show that while global 10 m products provide valuable information, careful product selection and regional validation remain essential for heterogeneous karst environments. Future improvements should integrate multi-source data (optical + synthetic aperture radar), apply topographic compensation for shadow effects, and develop region-specific approaches for mapping vegetation transitions. Full article
(This article belongs to the Section Environmental Remote Sensing)
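As a quick plausibility check of the reported 79.39 ± 2.19% overall accuracy from 1416 reference points, a simple-random-sampling normal approximation (which ignores the stratified design, so an exact match is not expected) gives a margin of similar size:

```python
# Back-of-the-envelope check of the reported overall-accuracy margin using a
# simple-random-sampling normal approximation; the published ±2.19% presumably
# accounts for the stratified sampling weights, so it need not match exactly.
import math

n = 1416          # reference points
p = 0.7939        # reported overall accuracy
se = math.sqrt(p * (1 - p) / n)
print(f"95% CI half-width ~ {1.96 * se * 100:.2f} percentage points")  # ~2.11
```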
17 pages, 1399 KB  
Article
Interpretable Two-Stage Machine Learning for Early and Full Drug Release Prediction in PLGA Microspheres
by Younghun Song, Saroj Bashyal, Hyuk Jun Cho, Mi Ran Woo, Jong Oh Kim and Duhyeong Hwang
Pharmaceuticals 2026, 19(5), 767; https://doi.org/10.3390/ph19050767 - 14 May 2026
Abstract
Background/Objectives: Poly(lactic-co-glycolic acid) (PLGA) microspheres are widely used in long-acting injectable (LAI) formulations because PLGA exhibits well-established biocompatibility and undergoes controlled hydrolytic degradation into metabolizable byproducts. However, optimization of microspheres typically requires time-consuming in vitro testing. Therefore, we developed a predictive machine learning model for early-stage and full time-dependent release profiles of drug-loaded PLGA microspheres. Methods: Using a published dataset comprising 321 release profiles from 89 drugs, we first developed a classification model to identify slow-release behavior (≤20% release within 3 days) and subsequently integrated the predicted early-release probability into a regression model to estimate cumulative release over time. Results: Among tree-based ensemble models, XGBoost achieved the lowest mean absolute error (MAE = 0.126) and highest Pearson correlation coefficient (r = 0.831). SHapley Additive exPlanations (SHAP) analysis revealed that drug and polymer molecular weight, predictive slow-release probability, and polymer concentration substantially influence release behavior. We also assessed this framework with external datasets. Drug release data for olaparib-loaded PLGA microspheres were obtained in-house, whereas those for semaglutide-based microspheres were obtained from the published literature. In both datasets, this framework demonstrated low MAE values (0.096 and 0.068, respectively). Conclusions: This suggests that the proposed framework can predict in vitro drug release and support efficient optimization of PLGA-based LAI formulations. Full article
(This article belongs to the Section Pharmaceutical Technology)
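The two-stage design described here (classify slow-release behaviour, then feed the predicted probability to a release regressor) can be sketched as follows; the data are synthetic and scikit-learn gradient boosting stands in for the XGBoost models used in the study.

```python
# Rough sketch of the two-stage idea: (1) classify slow release (<=20% within
# 3 days), (2) feed the predicted slow-release probability into a regressor for
# cumulative release. Synthetic data; gradient boosting from scikit-learn
# stands in for the paper's XGBoost models.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=(n, 6))                                    # formulation/process features
slow = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)               # hypothetical slow-release label
release = 0.6 - 0.3 * slow + 0.05 * X[:, 2] + rng.normal(0, 0.05, n)  # cumulative release

X_tr, X_te, s_tr, s_te, y_tr, y_te = train_test_split(X, slow, release, random_state=0)

clf = GradientBoostingClassifier(random_state=0).fit(X_tr, s_tr)
p_tr = clf.predict_proba(X_tr)[:, 1]            # stage-1 probability used as a feature
p_te = clf.predict_proba(X_te)[:, 1]

reg = GradientBoostingRegressor(random_state=0).fit(np.c_[X_tr, p_tr], y_tr)
pred = reg.predict(np.c_[X_te, p_te])
print("MAE:", np.mean(np.abs(pred - y_te)))
```

In practice the stage-1 probabilities used to train the regressor would be cross-fitted to avoid leakage; the paper's exact feature set and evaluation protocol differ.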
23 pages, 1199 KB  
Systematic Review
The Bridge Between Artificial Intelligence and Predictive Maintenance in Industry 4.0: A Systematic Review
by Daniel Arez, Helena V. G. Navas and Pedro Gaspar
Appl. Sci. 2026, 16(10), 4882; https://doi.org/10.3390/app16104882 - 14 May 2026
Abstract
This systematic literature review explores the intersection of Artificial Intelligence (AI) and Predictive Maintenance (PdM) within Industry 4.0. Using a PRISMA-based methodology, 123 studies published between 2014 and April 2024 were analyzed to characterize technological trends, algorithmic choices, industrial applications, and evaluation practices. The review reveals a consistent growth of research interest, driven by the widespread adoption of Internet of Things (IoT) devices and increased data availability. The manufacturing sector dominates the literature, although most studies rely on standardized datasets rather than real industrial environments. Among the identified AI methods, Random Forest (RF), Support Vector Machine (SVM), Decision Tree (DT) and K-Nearest Neighbors (KNNs) represent the most frequently applied algorithms for tasks such as failure prediction, fault detection, and remaining useful life (RUL) estimation. Model performance is commonly evaluated with Accuracy (Acc), Precision, Recall, F1-Score, and Root Mean Square Error (RMSE), reflecting the prevalence of both classification and regression-based PdM analyses. Despite significant advances, this review identifies persistent gaps, including limited domain diversity, scarce long-term real-world validation, and insufficient use of eXplainable AI (XAI) techniques. The findings highlight the need for broader domain coverage, improved interpretability, and validation under realistic industrial conditions. Overall, this review consolidates current knowledge on AI-enabled PdM and outlines critical directions to enhance reliability, transparency, and industrial relevance in the context of Industry 4.0. Full article
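For readers unfamiliar with the evaluation metrics listed above, the toy sketch below computes them with scikit-learn for a classification-style task (fault detection) and a regression-style task (RUL estimation); the arrays are illustrative only, not results from any reviewed study.

```python
# Generic sketch of the metrics named in the review: Accuracy, Precision,
# Recall, F1 for a classification-style PdM task, and RMSE for RUL regression.
import numpy as np
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, mean_squared_error)

y_true_cls = np.array([0, 1, 1, 0, 1, 0, 1, 1])     # 1 = failure within horizon
y_pred_cls = np.array([0, 1, 0, 0, 1, 1, 1, 1])
print("Acc ", accuracy_score(y_true_cls, y_pred_cls))
print("Prec", precision_score(y_true_cls, y_pred_cls))
print("Rec ", recall_score(y_true_cls, y_pred_cls))
print("F1  ", f1_score(y_true_cls, y_pred_cls))

y_true_rul = np.array([120.0, 80.0, 45.0, 10.0])    # remaining useful life (cycles)
y_pred_rul = np.array([110.0, 95.0, 40.0, 20.0])
print("RMSE", np.sqrt(mean_squared_error(y_true_rul, y_pred_rul)))
```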
15 pages, 1288 KB  
Article
Feasibility Study of Noninvasive Subcutaneous Imaging for Vein Localization
by Sen Bing, Mao-Hsiang Huang, Hung Cao and J.-C. Chiao
Electronics 2026, 15(10), 2082; https://doi.org/10.3390/electronics15102082 - 13 May 2026
Abstract
This work presents a noninvasive imaging method to locate veins using a tuned microwave loop resonator. It offers a low-cost, fast, and effective solution to the challenges in venipuncture. The sensor features a loop resonator with a 5.2 mm radius, incorporating a self-tuning mechanism, and operates at 2.408 GHz with a reflection coefficient of −48.77 dB. It generates localized high-intensity electric fields that penetrate tissues to sufficient depths, enabling the detection of veins based on shifts in resonant frequencies that are induced by the varied dielectric properties of blood vessels. Two-dimensional raster scan simulations of the cephalic and median cubital veins yielded a ∼25 MHz downward resonant-frequency shift between vein and non-vein positions, with the median cubital vein still detectable at depths up to 6 mm. To quantify generalization to real tissues, a decision tree classifier trained on 63 simulation samples and evaluated on 335 in vivo measurements achieved 82.09% classification accuracy (sensitivity 81.25%, specificity 83.02%), demonstrating that the simulation-derived frequency contrast transfers reliably to experimental data despite inter-subject tissue variability. Extensive tests conducted demonstrate the sensor’s effectiveness, producing consistent and distinguishable frequency shifts when the sensor moves on the skin across veins. This technology holds significant promise for improving venipuncture accuracy, minimizing complications, and enhancing patient comfort. Full article
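The train-on-simulation, test-on-measurement evaluation described here can be mimicked with a few lines of scikit-learn; the frequency-shift feature and all sample values below are synthetic placeholders, not the paper's data.

```python
# Sketch of the evaluation style above: fit a decision tree on simulated
# frequency-shift features, then score it on measured samples. The vein class
# is given a ~25 MHz downward shift, loosely mirroring the abstract; all
# numbers are synthetic placeholders.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import balanced_accuracy_score, confusion_matrix

rng = np.random.default_rng(1)
# feature: resonant-frequency shift in MHz
f_sim  = np.r_[rng.normal(-25, 4, 31), rng.normal(0, 4, 32)].reshape(-1, 1)   # 63 simulated
y_sim  = np.r_[np.ones(31), np.zeros(32)]
f_meas = np.r_[rng.normal(-22, 8, 160), rng.normal(-2, 8, 175)].reshape(-1, 1)  # 335 measured
y_meas = np.r_[np.ones(160), np.zeros(175)]

clf = DecisionTreeClassifier(max_depth=2, random_state=0).fit(f_sim, y_sim)
pred = clf.predict(f_meas)
print(confusion_matrix(y_meas, pred))
print("balanced accuracy:", balanced_accuracy_score(y_meas, pred))
```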
29 pages, 2181 KB  
Article
Geographical Origin Discrimination of Aniseed (Pimpinella anisum) Based on Machine Learning Classification of Agricultural and GC-MS Parameters
by Milica Aćimović, Biljana Lončar, Olja Šovljanski, Ana Tomić, Vanja Travičić, Milada Pezo, Vladimir Filipović, Danijela Šuput, Darko Micić and Lato Pezo
AgriEngineering 2026, 8(5), 194; https://doi.org/10.3390/agriengineering8050194 - 13 May 2026
Abstract
The geographical origin of aniseed (Pimpinella anisum L.) represents a key quality determinant, as it directly influences the chemical composition and commercial value of its essential oil. Agronomic traits of aniseed (plant height, umbel diameter, number of umbels per plant), productivity-related traits (number of seeds, thousand-seed weight, yield per plant, plant biomass, harvest index, yield per hectare, essential oil content and yield), and physiological traits (germination energy and total germination) exhibit variations depending on geographical origin. The study proposes an integrated framework for accurate classification by combining agronomic, productivity, and physiological data with GC-MS profiles and advanced machine learning (ML) techniques. A total of 144 samples were analyzed, based on a factorial design including three locations, six fertilizer treatments, two years, and four replications. trans-Anethole was the dominant compound in all samples (89.508–101.441%). Several classification models, including artificial neural networks, random forests, MARSplines, boosted trees, interactive trees, naïve Bayes, and support vector machines, were evaluated to discriminate samples by geographical origin using agro-meteorological and GC-MS data. The results indicate that AI and ML approaches effectively captured complex non-linear relationships. Overall, the multi-model framework highlights the strong potential of machine learning for agro-food authentication, supporting improved traceability, site-specific decision-making, and quality control. Full article
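A minimal sketch of screening several classifiers for origin discrimination on a combined feature table is shown below; the data are synthetic stand-ins with three location classes, and only a subset of the models named in the abstract is included.

```python
# Quick sketch of comparing classifiers for origin discrimination on a combined
# agronomic + GC-MS feature table; synthetic stand-in data, not the study's
# 144 samples or its full model set.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score, StratifiedKFold

X, y = make_classification(n_samples=144, n_features=25, n_informative=10,
                           n_classes=3, random_state=0)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for name, model in [("random forest", RandomForestClassifier(random_state=0)),
                    ("SVM (RBF)", SVC()),
                    ("naive Bayes", GaussianNB())]:
    print(name, cross_val_score(model, X, y, cv=cv).mean().round(3))
```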
19 pages, 953 KB  
Article
Subject-Wise Depression Screening from Eight-Channel Resting-State EEG Using Asymmetry-Aware Spectral Features and Connectivity Ablation
by Hassan Ugail, Newton Howard, Ali Ahmed Elmahmudi and Zied Mnasri
Sensors 2026, 26(10), 3065; https://doi.org/10.3390/s26103065 - 12 May 2026
Abstract
Major depressive disorder remains difficult to diagnose objectively, as routine assessment is still largely dependent on clinical interview and rating scales. Resting-state electroencephalography (EEG) is an attractive complementary modality because it is non-invasive, low-cost, and compatible with wearable sensing, but many reported EEG classification results are weakened by segment-level leakage and unclear subject identity handling. This study evaluates whether depression can be distinguished from healthy controls using a compact eight-channel resting-state EEG configuration under a strictly leakage-free subject-wise protocol. Using a widely used public EEG dataset, we first corrected a previously overlooked subject-identity ambiguity by constructing a class-aware composite key, yielding 56 valid unique participants. We then applied ten repeated subject-wise holdout splits and compared five compact baselines spanning Extra Trees and a multi-layer perceptron on asymmetry-aware spectral features and three convolutional networks on raw signals, including the EEG-specific EEGNet and ShallowConvNet architectures. Uncertainty was quantified through 95% bootstrap confidence intervals of the mean across repeats. The best model, an Extra Trees classifier using eight-channel spectral and asymmetry features, achieved a mean balanced accuracy of 93.5% with a 95% bootstrap confidence interval of 89.6% to 96.8% and a mean area under the receiver operating characteristic curve of 98.6% with a 95% bootstrap confidence interval of 96.2% to 100.0%. A connectivity ablation showed that inter-channel coherence was informative in isolation but did not improve performance when naively fused with spectral features. A feature-selection ablation did not show evidence that the 90-dimensional spectral representation was dominated by noisy or uninformative dimensions under this evaluation protocol. These results support compact, subject-wise evaluated EEG screening pipelines while highlighting the importance of rigorous leakage control. Full article
(This article belongs to the Section Wearables)
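The leakage-free subject-wise protocol and bootstrap confidence intervals described above can be sketched as follows; the 90-dimensional features and subject labels are synthetic, and GroupShuffleSplit stands in for the repeated subject-wise holdouts.

```python
# Sketch of a leakage-free subject-wise protocol: segments from one subject
# never appear in both train and test, and performance is summarized with a
# bootstrap CI over repeated holdouts. Synthetic features, not the dataset.
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import GroupShuffleSplit
from sklearn.metrics import balanced_accuracy_score

rng = np.random.default_rng(0)
n_subj, seg_per_subj = 56, 20
groups = np.repeat(np.arange(n_subj), seg_per_subj)          # subject id per segment
y = np.repeat(rng.integers(0, 2, n_subj), seg_per_subj)      # subject-level label
X = rng.normal(size=(len(y), 90)) + y[:, None] * 0.3         # 90-dim spectral features

scores = []
gss = GroupShuffleSplit(n_splits=10, test_size=0.3, random_state=0)
for tr, te in gss.split(X, y, groups):
    clf = ExtraTreesClassifier(n_estimators=300, random_state=0).fit(X[tr], y[tr])
    scores.append(balanced_accuracy_score(y[te], clf.predict(X[te])))

boot = [np.mean(rng.choice(scores, size=len(scores))) for _ in range(2000)]
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"mean balanced acc {np.mean(scores):.3f}, 95% bootstrap CI [{lo:.3f}, {hi:.3f}]")
```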
38 pages, 5046 KB  
Article
Using Sentinel-2 Time Series to Monitor the Loss of Individual Large Trees in Humanized Landscapes
by João Gonçalo Soutinho, Kerri T. Vierling, Lee A. Vierling, Jörg Müller and João F. Gonçalves
Remote Sens. 2026, 18(10), 1519; https://doi.org/10.3390/rs18101519 - 12 May 2026
Abstract
Large trees are keystone ecological structures that sustain biodiversity and ecosystem services, particularly in human-altered landscapes. However, their persistence is increasingly threatened by land-use change, urban expansion, and inadequate monitoring. This study develops and validates a scalable, automated framework for monitoring the loss of large individual trees using satellite image time series and breakpoint detection. We compared four spectral indices (SIs): Enhanced Vegetation Index 2–EVI2; Normalized Burn Ratio–NBR; Normalized Difference Red Edge–NDRE, and the Normalized Difference Vegetation Index–NDVI derived from Sentinel-2 imagery (2015–2025) for 691 georeferenced trees in Lousada, northern Portugal. Data were accessed and processed in Google Earth Engine and analyzed using a custom R-based workflow, including cloud masking, gap-filling, temporal interpolation, upper-envelope smoothing, deseasonalization, and break detection. Five breakpoint detection algorithms were compared: BFAST, energy-divisive, linear regression of structural changes, wild-binary segmentation, and change point models. Detected breakpoints were subsequently post-validated to determine whether they were associated with declines in SIs, using three pre-/post-breakpoint methods: comparisons of short- and long-term medians and a randomized trend analysis. As a baseline, these algorithms/post-validation logic were compared against the Continuous Change Detection and Classification (CCDC) approach. The results indicate moderate but consistent break detection performance, with a maximum balanced accuracy of 73% (for EVI2 or NDVI and using the energy-divisive algorithm coupled with the long-term median post-validator) under conservative validation criteria and high specificity for surviving trees. CCDC ranked comparatively lower at 62%. Algorithm performance varied substantially, with the energy-divisive providing the most conservative detection and the wild-binary segmentation yielding higher sensitivity. Performance was further influenced by tree structural attributes and species identity, with larger, taller and isolated trees, as well as particular genera, showing higher detection accuracy, with genus Eucalyptus, Tilia and Celtis yielding top performance results (79–65%) and Quercus, Castanea and Platanus the lowest (62–60%). By integrating satellite observations with large-tree inventory data from the Green Giants citizen science project, this study demonstrates the potential of decentralized, Earth observation-based monitoring to support tree-level loss assessments in fragmented landscapes. The proposed framework provides a transferable foundation for wide-scale monitoring of large trees in peri-urban and mixed-use environments. Full article
(This article belongs to the Special Issue Urban Ecology Monitoring Using Remote Sensing)
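The core post-validation logic (detect a breakpoint, then confirm an index drop with pre-/post-break medians) can be illustrated in a self-contained way; the paper's actual workflow is R-based and uses BFAST, energy-divisive and related algorithms, so the toy least-squares search below is only an analogue.

```python
# Toy analogue of "detect a breakpoint, then post-validate it with a pre-/post-
# break median comparison" on a synthetic deseasonalized NDVI-like series.
# Not the paper's R-based workflow.
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(200)
series = np.where(t < 140, 0.75, 0.35) + rng.normal(0, 0.04, t.size)  # tree loss at t=140

def best_breakpoint(x, min_seg=10):
    """Return the split index minimizing total within-segment squared error."""
    best_i, best_cost = None, np.inf
    for i in range(min_seg, len(x) - min_seg):
        cost = ((x[:i] - x[:i].mean()) ** 2).sum() + ((x[i:] - x[i:].mean()) ** 2).sum()
        if cost < best_cost:
            best_i, best_cost = i, cost
    return best_i

bp = best_breakpoint(series)
drop = np.median(series[:bp]) - np.median(series[bp:])
print(f"breakpoint at t={bp}, median drop={drop:.2f}",
      "-> likely tree loss" if drop > 0.2 else "-> not validated")
```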
18 pages, 27124 KB  
Article
Research on Plantar Signal Measurement and Foot Arch Classification
by Jinyu Zhu, Baoqing Nie and Chuanhao Yu
Electronics 2026, 15(10), 2051; https://doi.org/10.3390/electronics15102051 - 11 May 2026
Abstract
The foot arch functions as a dynamic biomechanical system, maintained by the integrated actions of bones, ligaments, and muscles. A large body of clinical evidence indicates that, in addition to congenital foot deformities, acquired variations in the foot arch caused by factors such as poor gait, aging, weight, or injury can significantly affect quality of life. Early intervention upon detection of foot arch changes can help mitigate progression and prevent further deterioration. Despite the availability of multimodal sensor-integrated running platforms for gait analysis, such systems are inherently bulky and not conducive to routine walking measurement. To overcome the above limitations, this study employed a flexible plantar pressure insole with an integrated accelerometer and a dedicated acquisition circuit to capture plantar pressure and acceleration data. This smart insole system acquires plantar data, performs feature extraction via time–domain and wavelet analysis, and then employs machine learning to classify the foot arch type as a normal foot, flatfoot, or high-arched. A Random Forest classifier was then established to categorize foot arch types based on the collected data, which integrates numerous decision trees through bootstrap aggregation and random feature selection, with final classification determined by majority voting. A total of 30 volunteers participated, including 11 with normal arches, 11 with flat feet, and 8 with high arches. Compared with support vector machine, K nearest neighbors, and decision tree, the Random Forest achieved the highest recognition accuracy of 92%. This system reveals the patterns of plantar pressure distribution and acceleration fluctuations during walking across three foot arches and demonstrates that wavelet entropy can effectively quantify the changes in signal complexity included in foot arch differences. Compared with laboratory force plates, this system features lower cost and a smaller form factor, making it suitable for real-time monitoring. This system can lay the technical foundation for personalized foot orthopedics and health monitoring. Full article
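A hedged sketch of the wavelet-entropy feature mentioned above, computed with PyWavelets and fed to a Random Forest, is shown below; the signals, the 'db4' wavelet choice, and the extra summary features are placeholders rather than the study's pipeline.

```python
# Sketch of a wavelet-entropy feature (PyWavelets) feeding a Random Forest for
# three arch classes. Synthetic pressure-like signals; the wavelet and feature
# choices are assumptions, not the study's acquisition pipeline.
import numpy as np
import pywt
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def wavelet_entropy(signal, wavelet="db4", level=4):
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    energies = np.array([np.sum(c ** 2) for c in coeffs])
    p = energies / energies.sum()
    return -np.sum(p * np.log(p + 1e-12))            # Shannon entropy of level energies

rng = np.random.default_rng(0)
X, y = [], []
for arch in range(3):                                 # 0 normal, 1 flat, 2 high arch
    for _ in range(30):
        sig = np.sin(np.linspace(0, 8 * np.pi, 512)) * (1 + 0.3 * arch)
        sig += rng.normal(0, 0.2 + 0.1 * arch, 512)
        X.append([wavelet_entropy(sig), sig.mean(), sig.std()])
        y.append(arch)

rf = RandomForestClassifier(n_estimators=200, random_state=0)
print("CV accuracy:", cross_val_score(rf, np.array(X), np.array(y), cv=5).mean())
```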
28 pages, 1863 KB  
Article
Explaining Global Happiness: Evidence from Decision Trees and Necessary Condition Analysis
by Teresa Torres-Coronas and Jorge de Andrés-Sánchez
Economies 2026, 14(5), 172; https://doi.org/10.3390/economies14050172 - 11 May 2026
Abstract
In this paper, the factors linked to happiness are analyzed on the basis of the World Happiness Report (WHR) model, introducing a methodological approach that differs from traditional econometric techniques. More specifically, this study examines how the core variables of the WHR model interact in relation to happiness and whether some of them also emerge as necessary conditions within the framework of necessary condition analysis (NCA) for attaining higher happiness levels. Using Gallup World Poll data for the 2022–2024 period, the Cantril Ladder is employed as a measure of subjective well-being, and gross domestic product (GDP) per capita, social support, healthy life expectancy, freedom, generosity, and perceived corruption are considered explanatory variables. This study makes two contributions. First, it applies a decision tree regression model to identify interactions among the correlates of happiness while also facilitating the classification of countries into homogeneous groups according to their well-being configurations. This approach improves interpretability relative to linear models because it does not require prior specification of those interactions. Second, this paper incorporates necessary condition analysis to distinguish between factors that are merely influential and those that emerge as necessary conditions for attaining certain levels of happiness. These assessments make it possible to identify minimum thresholds in key variables, introducing a necessary-condition logic. The results show that social support and GDP per capita emerge as the main structuring variables in the tree and are strongly associated with differences in happiness, whereas freedom emerges as a prominent condition in the NCA results. The findings also show that some factors with low correlation may still play a relevant role in specific contexts because of nonlinear effects and interactions. Overall, the results of this study offer an analytical reinterpretation of the WHR model by combining structural segmentation and threshold identification, advancing the understanding of happiness as a multidimensional, nonlinear phenomenon associated with specific configurations of factors. Full article
(This article belongs to the Section Health Economics)
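The decision-tree regression idea, with its readable split structure, can be sketched on synthetic data as below; only the variable names follow the abstract, and the necessary condition analysis part is not reproduced.

```python
# Sketch of decision-tree regression on WHR-style predictors, showing how the
# fitted splits expose interactions and group observations into homogeneous
# segments. Synthetic data; only the variable names follow the abstract.
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeRegressor, export_text

rng = np.random.default_rng(0)
n = 150
df = pd.DataFrame({
    "gdp_per_capita": rng.normal(10, 1, n),
    "social_support": rng.uniform(0.4, 1.0, n),
    "healthy_life_expectancy": rng.normal(65, 8, n),
    "freedom": rng.uniform(0.3, 1.0, n),
})
ladder = 2 + 0.3 * df["gdp_per_capita"] + 2.5 * df["social_support"] \
         + 1.5 * df["freedom"] + rng.normal(0, 0.3, n)       # Cantril-ladder-like score

tree = DecisionTreeRegressor(max_depth=3, min_samples_leaf=10, random_state=0)
tree.fit(df, ladder)
print(export_text(tree, feature_names=list(df.columns)))      # readable split structure
```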
37 pages, 2804 KB  
Article
An Explainable XGBoost-Based Framework for IoT Attack Detection with Unseen Attack Family Evaluation
by Ruei-Jan Hung
Sensors 2026, 26(10), 3005; https://doi.org/10.3390/s26103005 - 10 May 2026
Abstract
The rapid growth of the Internet of Things (IoT) has introduced significant cybersecurity challenges due to the heterogeneity, scale, and limited protection capability of connected devices. Although machine learning has been widely adopted for IoT intrusion detection, many existing studies still rely primarily on closed-world evaluation settings, unequal baseline comparison budgets, fixed decision thresholds, and limited integration of explainability into model assessment. To address these issues, this paper proposes an explainable XGBoost-based framework for IoT attack detection with unseen attack family evaluation using the large-scale CICIoT2023 dataset. In the proposed framework, IoT traffic is formulated as a binary classification task that distinguishes benign from malicious flows. The study integrates two complementary evaluation protocols: (1) closed-world stratified 10-fold cross-validation for in-distribution performance assessment and (2) unseen attack family evaluation, in which one malicious family is excluded from training and used only for testing under a zero-day-like but single-dataset condition. A fair-budget experimental design is adopted to compare seven representative models under the same training budget, including default XGBoost, optimized XGBoost, Random Forest, LightGBM, CatBoost, Logistic Regression, and a simple multilayer perceptron. To improve reproducibility and operational validity, the revised framework further reports the sampling strategy, split-overlap audit, XGBoost hyperparameter search protocol, repeated unseen-family evaluation, validation-based threshold calibration under fixed-FAR constraints, cost-sensitive threshold analysis, and XGBoost-native SHapley Additive exPlanations (SHAP) compatible feature contribution analysis. The closed-world results show that tree-based ensemble methods clearly outperform the linear and shallow neural baselines. Random Forest achieves the highest closed-world macro-F1 of 0.9713, followed by LightGBM with 0.9602 and optimized XGBoost with 0.9566. In the fair-budget unseen-family setting under the default threshold, Random Forest again obtains the highest mean macro-F1 of 0.8433 and the lowest false negative rate (FNR) of 0.0712, but it also produces a substantially higher false alarm rate (FAR = 0.0536). By contrast, optimized XGBoost provides a lower-FAR default operating point, achieving a mean macro-F1 of 0.8194, Matthews correlation coefficient (MCC) of 0.7067, FAR of 0.0086, and FNR of 0.2996. Repeated unseen-family experiments over five random seeds confirm the same trade-off: Random Forest provides stronger recall-oriented detection, whereas optimized XGBoost provides a lower-FAR default operating point. After validation-based threshold calibration at an approximate FAR target of 0.01, Random Forest achieves the strongest calibrated recall-oriented performance, with macro-F1 of 0.8754, MCC of 0.7757, FNR of 0.2000, and attack recall of 0.8000. Optimized XGBoost remains competitive at the same FAR target, with macro-F1 of 0.8323, MCC of 0.7193, FNR of 0.2760, and attack recall of 0.7240. The explainability analysis indicates that the optimized XGBoost detector relies mainly on TCP control-flag, temporal, and packet-statistical features, with rst_count, IAT, urg_count, Tot size, Number, Header_Length, and Magnitude among the most influential variables. 
Local contribution tables for representative true-positive, false-positive, false-negative, and true-negative cases further improve the readability of the explanation results and confirm that native pred_contribs reconstructs the model margin with negligible numerical error. Overall, the results show that the most appropriate model depends on the deployment objective: Random Forest is preferable when minimizing missed attacks under a calibrated FAR constraint is prioritized, whereas optimized XGBoost remains a strong primary model for an explainable low-FAR XGBoost-based framework that emphasizes scalability, operational conservativeness, and native contribution-based interpretation. Full article
(This article belongs to the Special Issue Internet of Things Cybersecurity)
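Two ingredients of this framework lend themselves to a short sketch: calibrating the decision threshold on validation data to hit a target false alarm rate, and obtaining native per-feature contributions from XGBoost via pred_contribs. The data below are synthetic, not CICIoT2023, and the hyperparameters are placeholders.

```python
# Sketch of (1) validation-based threshold calibration so the false alarm rate
# on benign traffic is ~1%, and (2) XGBoost-native contributions via
# pred_contribs. Synthetic flows, not CICIoT2023.
import numpy as np
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=5000, n_features=20, weights=[0.7, 0.3],
                           random_state=0)
X_tr, X_tmp, y_tr, y_tmp = train_test_split(X, y, test_size=0.4, random_state=0)
X_val, X_te, y_val, y_te = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=0)

bst = xgb.train({"objective": "binary:logistic", "max_depth": 6, "eta": 0.1},
                xgb.DMatrix(X_tr, label=y_tr), num_boost_round=200)

# (1) threshold calibration: the 99th percentile of benign validation scores
# gives a threshold with FAR ~= 0.01 on validation; apply it to the test set.
val_scores = bst.predict(xgb.DMatrix(X_val))
thr = np.quantile(val_scores[y_val == 0], 0.99)
test_scores = bst.predict(xgb.DMatrix(X_te))
far = np.mean(test_scores[y_te == 0] >= thr)
recall = np.mean(test_scores[y_te == 1] >= thr)
print(f"threshold={thr:.3f}, test FAR={far:.3f}, attack recall={recall:.3f}")

# (2) native contributions: one row per sample, one column per feature plus a
# bias term; the row sums reconstruct the raw prediction margin.
contribs = bst.predict(xgb.DMatrix(X_te[:5]), pred_contribs=True)
print(contribs.shape)   # (5, n_features + 1)
```

The calibrated threshold is chosen purely on benign validation scores, so the achieved test FAR will drift if the benign score distribution shifts in deployment.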
24 pages, 6760 KB  
Article
Overcoming Generalization Issues in Flood Prediction: A Machine Learning Approach Across Multiple Basins
by Ufuk Yükseler, Omerul Faruk Dursun, Mete Yağanoğlu and Abdolmajid Mohammadian
Sustainability 2026, 18(10), 4724; https://doi.org/10.3390/su18104724 - 9 May 2026
Abstract
Flooding is a complex, unpredictable disaster that occurs frequently and can have devastating impacts. Over the past two decades, the advent of machine learning (ML) methods has led to a surge in studies focused on flood prediction, emphasizing high-performance algorithms and fast processing times. The present study aims to investigate the challenges of generalization in flood prediction models using machine learning techniques. A dataset of 18,810 samples was compiled from 40 river basins covering the period 1959–2020. Nine machine learning algorithms were applied to the analysis: Logistic Regression, Support Vector Machine, K-Nearest Neighbors, Decision Tree, Random Forest, AdaBoost, Gradient Boosting, Extra Trees, and Gaussian Naive Bayes. Four distinct validation methods were employed to assess the performance of the models, and the results were thoroughly analyzed. The Gradient Boosting model demonstrated exceptional validation performance indicating its robustness across diverse datasets. High accuracy was also observed in the Decision Tree, Random Forest, Extra Trees, and AdaBoost models. However, for datasets with fewer than 200 samples, these four models experienced a decline in performance. Elevation was identified as the most important factor influencing flooding in 36 basins. NDVI was the dominant factor in 3 basins, while rainfall was the main driver in only 1 basin. The results highlight the contributions and shortcomings of machine learning methods in sustainable flood disaster management systems. Full article
(This article belongs to the Section Sustainable Engineering and Science)
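Generalization across basins, as studied above, is commonly probed by grouping samples by basin so that held-out basins are never seen during training; the sketch below uses GroupKFold with placeholder features (elevation, NDVI, rainfall) and a toy flood label, not the study's 18,810-sample dataset.

```python
# Sketch of basin-wise validation: GroupKFold keeps every sample from a
# held-out basin out of training, probing generalization to unseen basins.
# Synthetic data; basin ids, features and the flood label are placeholders.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GroupKFold, cross_val_score

rng = np.random.default_rng(0)
n = 2000
basin = rng.integers(0, 40, n)                       # 40 basins
X = np.c_[rng.normal(500, 200, n),                   # elevation
          rng.uniform(0, 1, n),                      # NDVI
          rng.gamma(2.0, 30.0, n)]                   # rainfall
y = ((X[:, 0] < 400) & (X[:, 2] > 60)).astype(int)   # toy flood label

clf = GradientBoostingClassifier(random_state=0)
scores = cross_val_score(clf, X, y, groups=basin, cv=GroupKFold(n_splits=5))
print("basin-wise CV accuracy:", scores.round(3))
```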
25 pages, 1054 KB  
Article
DNA Barcoding and Allele-Specific PCR Discrimination of Glasswort Ecotypes from Apulia Region (Southern Italy)
by Angelica Giancaspro, Giulia Conversa, Luigi Giuseppe Duri, Gaetana Ricatti, Antonio Elia, Stefano Pavan and Concetta Lotti
Agronomy 2026, 16(10), 947; https://doi.org/10.3390/agronomy16100947 - 8 May 2026
Abstract
In the scenario of ongoing climate changes, the selection of plant genotypes with high salt tolerance is emerging as the most sustainable strategy to safeguard crop yield and quality and make productive use of salinized soils. Glassworts are annual and perennial halophytes found in inner and coastal wastelands, indistinctly consumed as high-nutritional green vegetables. Traditional taxonomic classification based on morphological traits can be very challenging in glasswort, due to phenotypic plasticity, reduced plant morphology, and inbreeding. In this work, we used DNA-based molecular tools to overcome such constraints and assess inter-generic and inter-specific genetic diversity in a collection of ecotypes from different Apulian areas. A fast and reliable Allele-Specific PCR assay was optimized to enable molecular detection of annual and perennial genera. Species-level classification was obtained through a similarity- and phylogeny-based approach relying on matK and rbcL DNA barcoding. Combined DNA tools identified perennial samples as Sarcocornia fruticosa and Arthrocaulon macrostachyum, along with annual Salicornia europaea, and phylogenetic trees unveiled genetic distances between glassworts, which clustered according to life cycle. The relationship between genotypes and nutritional profiles was finally investigated, suggesting that environmental factors may play a predominant role over taxonomic relatedness in shaping interspecific differences in nutrient composition of the analyzed samples. Full article
33 pages, 10584 KB  
Article
Beyond Accuracy in AI: A Multi-Objective Benchmark of Inductive Bias, Robustness, Computational Efficiency, and Pareto-Optimal Trade-Off
by Hüseyin Enes Okutan and Muhammet Baykara
Appl. Sci. 2026, 16(10), 4637; https://doi.org/10.3390/app16104637 - 8 May 2026
Abstract
Nonlinear classification problems such as XOR are widely used to evaluate machine learning models beyond linear separability. In this study, a comprehensive benchmark is proposed to analyze eight classifiers (Logistic Regression, Linear SVM, RBF SVM, Decision Tree, Random Forest, KNN, MLP_small, MLP_deep) across four XOR variants (clean, noisy, rotated, high-dimensional). A total of 640 controlled experiments are conducted using multiple sample sizes and random seeds. Models are evaluated using a multi-objective framework including accuracy, training and inference time, memory usage, energy consumption, and model size. Results show that MLP_deep achieves the highest overall accuracy, while MLP_small provides competitive performance with significantly lower computational cost. Decision Tree offers a strong balance between efficiency and accuracy, whereas Random Forest achieves competitive accuracy at higher resource usage. High-dimensional XOR is the most challenging scenario, significantly reducing overall performance across models. Pareto frontier analysis further highlights optimal trade-offs between predictive performance and resource efficiency. The study demonstrates that no single model is universally optimal and emphasizes the importance of resource-aware model selection in nonlinear classification tasks. Full article
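A toy analogue of this benchmark, shown below, generates a noisy XOR dataset, scores two classifiers on accuracy and training time, and flags which points are Pareto-optimal; it does not reproduce the paper's 640-experiment protocol or its energy and memory measurements.

```python
# Toy analogue of the benchmark: noisy XOR data, two classifiers scored on
# accuracy and training time, and a check for Pareto-optimal points
# (higher accuracy, lower time). Not the paper's full protocol.
import time
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(2000, 2))
y = ((X[:, 0] > 0) ^ (X[:, 1] > 0)).astype(int)       # XOR labels
X += rng.normal(0, 0.1, X.shape)                      # noisy variant
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

results = []
for name, model in [("DecisionTree", DecisionTreeClassifier(random_state=0)),
                    ("MLP_small", MLPClassifier(hidden_layer_sizes=(16,),
                                                max_iter=2000, random_state=0))]:
    t0 = time.perf_counter()
    model.fit(X_tr, y_tr)
    results.append((name, accuracy_score(y_te, model.predict(X_te)),
                    time.perf_counter() - t0))

for name, acc, sec in results:
    dominated = any(a >= acc and s <= sec and (a, s) != (acc, sec)
                    for _, a, s in results)
    print(f"{name}: acc={acc:.3f}, time={sec:.3f}s, Pareto-optimal={not dominated}")
```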