Search Results (615)

Search Parameters:
Keywords = Bag of Features

41 pages, 3681 KB  
Article
Proof-of-Concept Machine Learning Framework for Arboviral Disease Classification Using Literature-Derived Synthetic Data: Methodological Development Preceding Clinical Validation
by Elí Cruz-Parada, Guillermina Vivar-Estudillo, Laura Pérez-Campos Mayoral, María Teresa Hernández-Huerta, Alma Dolores Pérez-Santiago, Carlos Romero-Diaz, Eduardo Pérez-Campos Mayoral, Iván A. García Montalvo, Lucia Martínez-Martínez, Héctor Martínez-Ruiz, Idarh Matadamas, Miriam Emily Avendaño-Villegas, Margarito Martínez Cruz, Hector Alejandro Cabrera-Fuentes, Aldo-Eleazar Pérez-Ramos, Eduardo Lorenzo Pérez-Campos and Carlos Mauricio Lastre-Domínguez
Healthcare 2026, 14(2), 247; https://doi.org/10.3390/healthcare14020247 - 19 Jan 2026
Abstract
Background/Objectives: Arboviral diseases share common vectors, geographic distribution, and symptoms. Developing Machine Learning diagnostic tools for co-circulating arboviral diseases faces data-scarcity challenges. This study aimed to demonstrate that a proof of concept using synthetic data can establish computational feasibility and guide future real-world validation efforts. Methods: We assembled a synthetic dataset of 28,000 records, with 7000 for each disease—Dengue, Zika, and Chikungunya—plus Influenza as a negative control. These records were obtained from the existing literature. A binary matrix with 67 symptoms was created for detailed statistical analysis using Odds Ratios, Chi-Square, and symptom-specific conditional prevalence to validate the clinical relevance of the simulated data. This dataset was used to train and evaluate various algorithms, including Multi-Layer Perceptron (MLP), Narrow Neural Network (NN), Quadratic Support Vector Machine (QSVM), and Bagged Tree (BT), employing multiple performance metrics: accuracy, precision, sensitivity, specificity, F1-score, AUC-ROC, and Cohen’s kappa coefficient. Results: The dataset aligns with the PAHO guidelines. Similar findings are observed in other arboviral databases, confirming the validity of the synthetic dataset. Notable performance was observed across all evaluated metrics. The NN model achieved an overall accuracy of 0.92 and an AUC above 0.98, with precision, sensitivity, and specificity values exceeding 0.85, and an average Uniform Cohen’s Kappa of 0.89, highlighting its ability to reliably distinguish between Dengue and Influenza, with a slight decrease in performance between Zika and Chikungunya. Conclusions: These models could accelerate early diagnosis of arboviral diseases by leveraging encoded symptom features for Machine Learning and Deep Learning approaches, serving as a support tool in regions with limited healthcare access without replacing clinical medical expertise. Full article
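The workflow this abstract describes — training a bagged-tree classifier on a binary symptom matrix and scoring it with accuracy and Cohen's kappa — can be sketched as below. The four classes, per-class symptom prevalences, and all sizes are invented for illustration; this is not the paper's literature-derived dataset or its tuned models.

```python
import numpy as np
from sklearn.ensemble import BaggingClassifier
from sklearn.metrics import accuracy_score, cohen_kappa_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n_per_class, n_symptoms = 500, 67  # mirrors the paper's 67-symptom binary matrix

# Hypothetical per-class symptom prevalences stand in for the literature-derived ones.
X_parts, y = [], []
for label in range(4):  # e.g. Dengue, Zika, Chikungunya, Influenza (illustrative order)
    p = rng.uniform(0.05, 0.6, size=n_symptoms)
    X_parts.append(rng.random((n_per_class, n_symptoms)) < p)
    y += [label] * n_per_class
X, y = np.vstack(X_parts).astype(int), np.array(y)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0, stratify=y)
clf = BaggingClassifier(DecisionTreeClassifier(), n_estimators=50, random_state=0).fit(X_tr, y_tr)
pred = clf.predict(X_te)
acc, kappa = accuracy_score(y_te, pred), cohen_kappa_score(y_te, pred)
```

The same fit/score loop would be repeated for each of the paper's other model families (MLP, QSVM, etc.) to fill out the metric comparison.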
27 pages, 3924 KB  
Article
Research and Optimization of Soil Major Nutrient Prediction Models Based on Electronic Nose and Improved Extreme Learning Machine
by He Liu, Yuhang Cao, Haoyu Zhao, Jiamu Wang, Changlin Li and Dongyan Huang
Agriculture 2026, 16(2), 174; https://doi.org/10.3390/agriculture16020174 - 9 Jan 2026
Abstract
Keeping the levels of soil major nutrients (total nitrogen, TN; available phosphorus, AP; and available potassium, AK) in optimum condition is important to achieve the goals of precision agriculture systems. To address the issues of slow speed and low accuracy in soil nutrient detection, this study developed a prediction model for soil major nutrient content based on an improved Extreme Learning Machine (ELM) algorithm. This model utilizes a soil major nutrient detection system integrating pyrolysis and artificial olfaction. First, the Bootstrap Aggregating (Bagging) ensemble strategy was introduced during the model integration phase to effectively reduce prediction variance through multi-submodel fusion. Second, Generative Adversarial Networks (GAN) were employed for sample augmentation, enhancing the diversity and representativeness of the dataset. Subsequently, a multi-scale convolutional and Efficient Lightweight Attention Network (ELA-Net) was embedded in the feature mapping layer to strengthen the representation capability of soil gas features. Finally, adaptive hyperparameter tuning was achieved using the Adaptive Chaotic Bald Eagle Optimization Algorithm (ACBOA) to enhance the model’s generalization capability. Results demonstrate that this model achieves varying degrees of performance improvement in predicting total nitrogen (R2 = 0.894), available phosphorus (R2 = 0.728), and available potassium (R2 = 0.706). Overall prediction accuracy surpasses traditional models by 8–12%, with significant reductions in both RMSE and MAE. These results demonstrate that the method can rapidly, accurately, and non-destructively estimate key soil nutrients, providing theoretical guidance and practical support for field fertilization, soil fertility assessment, and on-site decision-making in precision agriculture. Full article
(This article belongs to the Section Agricultural Soils)
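The ELM core this abstract builds on — a fixed random hidden layer with a least-squares output readout — plus the Bagging step can be sketched generically. The toy regression target and all hyperparameters are invented; none of the paper's GAN augmentation, ELA-Net, or ACBOA tuning is reproduced here.

```python
import numpy as np

class ELMRegressor:
    """Minimal Extreme Learning Machine: random hidden layer + least-squares readout."""
    def __init__(self, n_hidden=64, seed=0):
        self.n_hidden, self.seed = n_hidden, seed

    def fit(self, X, y):
        rng = np.random.default_rng(self.seed)
        self.W = rng.normal(size=(X.shape[1], self.n_hidden))  # fixed random weights
        self.b = rng.normal(size=self.n_hidden)
        H = np.tanh(X @ self.W + self.b)                       # hidden activations
        self.beta, *_ = np.linalg.lstsq(H, y, rcond=None)      # closed-form readout
        return self

    def predict(self, X):
        return np.tanh(X @ self.W + self.b) @ self.beta

def bagged_elm_predict(X_train, y_train, X_test, n_models=10):
    """Bootstrap-aggregated ELMs: average predictions of ELMs fit on resampled data."""
    rng = np.random.default_rng(42)
    preds = []
    for i in range(n_models):
        idx = rng.integers(0, len(X_train), size=len(X_train))  # bootstrap sample
        preds.append(ELMRegressor(seed=i).fit(X_train[idx], y_train[idx]).predict(X_test))
    return np.mean(preds, axis=0)

# Toy stand-in target: nutrient content as a smooth function of sensor features.
rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(400, 5))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + 0.05 * rng.normal(size=400)
pred = bagged_elm_predict(X[:300], y[:300], X[300:])
r2 = 1 - np.sum((y[300:] - pred) ** 2) / np.sum((y[300:] - y[300:].mean()) ** 2)
```

Averaging the bootstrap submodels is what reduces the variance of the notoriously seed-sensitive ELM readout.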

22 pages, 1021 KB  
Article
A Multiclass Machine Learning Framework for Detecting Routing Attacks in RPL-Based IoT Networks Using a Novel Simulation-Driven Dataset
by Niharika Panda and Supriya Muthuraman
Future Internet 2026, 18(1), 35; https://doi.org/10.3390/fi18010035 - 7 Jan 2026
Abstract
The use of resource-constrained Low-Power and Lossy Networks (LLNs), where the IPv6 Routing Protocol for LLNs (RPL) is the de facto routing standard, has increased due to the Internet of Things’ (IoT) explosive growth. Because of the dynamic nature of IoT deployments and the lack of in-protocol security, RPL is still quite susceptible to routing-layer attacks like Blackhole, Lowered Rank, version number manipulation, and Flooding despite its lightweight architecture. Lightweight, data-driven intrusion detection methods are necessary since traditional cryptographic countermeasures are frequently unfeasible for LLNs. However, the lack of RPL-specific control-plane semantics in current cybersecurity datasets restricts the use of machine learning (ML) for practical anomaly identification. In order to close this gap, this work models both static and mobile networks under benign and adversarial settings by creating a novel, large-scale multiclass RPL attack dataset using Contiki-NG’s Cooja simulator. To record detailed packet-level and control-plane activity including DODAG Information Object (DIO), DODAG Information Solicitation (DIS), and Destination Advertisement Object (DAO) message statistics along with forwarding and dropping patterns and objective-function fluctuations, a protocol-aware feature extraction pipeline is developed. This dataset is used to evaluate fifteen classifiers, including Logistic Regression (LR), Support Vector Machine (SVM), Decision Tree (DT), k-Nearest Neighbors (KNN), Random Forest (RF), Extra Trees (ET), Gradient Boosting (GB), AdaBoost (AB), and XGBoost (XGB) and several ensemble strategies like soft/hard voting, stacking, and bagging, as part of a comprehensive ML-based detection system. Numerous tests show that ensemble approaches offer better generalization and prediction performance. 
With overfitting gaps less than 0.006 and low cross-validation variance, the Soft Voting Classifier obtains the greatest accuracy of 99.47%, closely followed by XGBoost with 99.45% and Random Forest with 99.44%. Full article
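The soft-voting ensemble that tops the paper's comparison averages the class-probability outputs of several base learners. A minimal scikit-learn sketch follows; the generated features merely stand in for the RPL control-plane statistics (DIO/DIS/DAO counts, forwarding patterns), and the five classes loosely mirror benign traffic plus four attack types.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the simulation-driven RPL dataset: 5 traffic classes.
X, y = make_classification(n_samples=1500, n_features=20, n_informative=12,
                           n_classes=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

vote = VotingClassifier(
    estimators=[("lr", LogisticRegression(max_iter=1000)),
                ("rf", RandomForestClassifier(random_state=0)),
                ("dt", DecisionTreeClassifier(random_state=0))],
    voting="soft")  # "soft" averages predicted class probabilities across learners
vote.fit(X_tr, y_tr)
acc = vote.score(X_te, y_te)
```

Hard voting (majority of class labels), stacking, and bagging variants from the paper drop in by swapping the combiner.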

22 pages, 1777 KB  
Article
DP2PNet: Diffusion-Based Point-to-Polygon Conversion for Single-Point Supervised Oriented Object Detection
by Peng Li, Limin Zhang and Tao Qu
Sensors 2026, 26(1), 329; https://doi.org/10.3390/s26010329 - 4 Jan 2026
Abstract
Rotated Bounding Boxes (RBBs) for oriented object detection are labor-intensive and time-consuming to annotate. Single-point supervision offers a cost-effective alternative but suffers from insufficient size and orientation information, leading existing methods to rely heavily on complex priors and fixed refinement stages. In this paper, we propose DP2PNet (Diffusion-Point-to-Polygon Network), the first diffusion model-based framework for single-point supervised oriented object detection. DP2PNet features three key innovations: (1) A multi-scale consistent noise generator that replaces manual or external model priors with Gaussian noise, reducing dependency on domain-specific information; (2) A Noise Cross-Constraint module based on multi-instance learning, which selects optimal noise point bags by fusing receptive field matching and object coverage; (3) A Semantic Key Point Aggregator that aggregates noise points via graph convolution to form semantic key points, from which pseudo-RBBs are generated using convex hulls. DP2PNet supports dynamic adjustment of refinement stages without retraining, enabling flexible accuracy optimization. Extensive experiments on DOTA-v1.0 and DIOR-R datasets demonstrate that DP2PNet achieves 53.82% and 53.61% mAP50, respectively, comparable to methods relying on complex priors. It also exhibits strong noise robustness and cross-dataset generalization. Full article
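The final step the abstract mentions — wrapping aggregated semantic key points in a convex hull to obtain a pseudo rotated box — can be illustrated with SciPy. The point cloud around a single annotated center is invented here; the paper's graph-convolution aggregation is not reproduced.

```python
import numpy as np
from scipy.spatial import ConvexHull

rng = np.random.default_rng(0)
center = np.array([50.0, 30.0])  # the single-point annotation (illustrative coordinates)

# Hypothetical aggregated semantic key points scattered around the object.
key_points = center + rng.normal(scale=5.0, size=(40, 2))

hull = ConvexHull(key_points)
polygon = key_points[hull.vertices]  # hull vertices, counter-clockwise in 2-D
```

A minimum-area rotated rectangle fitted to `polygon` would then serve as the pseudo-RBB supervision signal.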

12 pages, 465 KB  
Article
Using QR Codes for Payment Card Fraud Detection
by Rachid Chelouah and Prince Nwaekwu
Information 2026, 17(1), 39; https://doi.org/10.3390/info17010039 - 4 Jan 2026
Abstract
Debit and credit card payments have become the preferred method of payment for consumers, replacing paper checks and cash. However, this shift has also led to an increase in concerns regarding identity theft and payment security. To address these challenges, it is crucial to develop an effective, secure, and reliable payment system. This research presents a comprehensive study on payment card fraud detection using deep learning techniques. The introduction highlights the significance of a strong financial system supported by a quick and secure payment system. It emphasizes the need for advanced methods to detect fraudulent activities in card transactions. The proposed methodology focuses on the conversion of a comma-separated values (CSV) dataset into quick response (QR) code images, enabling the application of deep neural networks and transfer learning. This representation allows leveraging pre-trained image-based architectures to provide a layer of privacy by encoding numeric transaction attributes into visual patterns. The feature extraction process involves the use of a convolutional neural network, specifically a residual network architecture. The results obtained through the under-sampling dataset balancing method revealed promising performance in terms of precision, accuracy, recall, and F1 score for the traditional models such as K-nearest neighbors (KNN), Decision tree, Random Forest, AdaBoost, Bagging, and Gaussian Naive Bayes. Furthermore, the proposed deep neural network model achieved high precision, indicating its effectiveness in detecting card fraud. The model also achieved high accuracy, recall, and F1 score, showcasing its superior performance compared to traditional machine learning models. In summary, this research contributes to the field of payment card fraud detection by leveraging deep learning techniques. 
The proposed methodology offers a sophisticated approach to detecting fraudulent activities in card payment systems, addressing the growing concerns of identity theft and payment security. By deploying the trained model in an Android application, real-time fraud detection becomes possible, further enhancing the security of card transactions. The findings of this study provide insights and avenues for future advancements in the field of payment card fraud detection. Full article
(This article belongs to the Section Information Security and Privacy)
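The key preprocessing idea — encoding a numeric CSV transaction row as a 2-D binary image that an image-based CNN can consume — can be approximated as follows. This is a toy byte-to-bit-grid stand-in, not a real QR encoder (a faithful pipeline would use an actual QR library such as the `qrcode` package), and the 30-feature row is hypothetical.

```python
import numpy as np

def row_to_bit_image(row, size=32):
    """Toy stand-in for the paper's CSV-to-QR step: serialize a transaction row
    and unpack its bytes into a square binary image for a CNN. NOT a real QR code."""
    payload = ",".join(f"{v:.4f}" for v in row).encode()
    bits = np.unpackbits(np.frombuffer(payload, dtype=np.uint8))
    img = np.zeros(size * size, dtype=np.uint8)
    img[: min(len(bits), img.size)] = bits[: img.size]  # truncate or zero-pad
    return img.reshape(size, size)

rng = np.random.default_rng(0)
transaction = rng.normal(size=30)  # e.g. 30 anonymized numeric features per record
image = row_to_bit_image(transaction)
```

The resulting arrays can be batched and fed to a pretrained residual network, which is the transfer-learning step the abstract describes.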

21 pages, 2522 KB  
Article
Integrating SVR Optimization and Machine Learning-Based Feature Importance for TBM Penetration Rate Prediction
by Halil Karahan and Devrim Alkaya
Appl. Sci. 2026, 16(1), 355; https://doi.org/10.3390/app16010355 - 29 Dec 2025
Abstract
In this study, a Support Vector Regression (SVR) model was developed to predict the rate of penetration (ROP) during tunnel excavation, and its hyperparameters were optimized using Grid Search (GS), Random Search (RS), and Bayesian Optimization (BO). The results indicate that BO reached the optimal parameter set with only 30–50 evaluations, whereas GS and RS required approximately 1000 evaluations. In addition, BO achieved the highest predictive accuracy (R2 = 0.9625) while reducing the computational time from 25.83 s (GS) to 17.31 s. Compared with the baseline SVM model, the optimized SVR demonstrated high accuracy (R2 = 0.9610–0.9625), strong stability (NSE = 0.9194–0.9231), and low error levels (MAE = 0.0927–0.1099), clearly highlighting the critical role of hyperparameter optimization in improving model performance. To enhance interpretability, a feature importance analysis was conducted using four machine learning methods: Random Forest (RF), Bagged Trees (BT), Support Vector Machines (SVM), and the Generalized Additive Model (GAM). The relative contributions of BI, UCS, ALPHA, and DPW to ROP were evaluated, providing clearer insight into the model’s decision-making process and enabling more reliable engineering interpretation. Overall, integrating hyperparameter optimization with feature importance analysis significantly improves both predictive performance and model explainability. The proposed approach offers a robust, generalizable, and scientifically sound framework for TBM operations and geotechnical modeling applications. Full article
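The SVR-plus-hyperparameter-search setup can be sketched with scikit-learn. The paper's Bayesian optimization is replaced here by a random search over the same C / gamma / epsilon space (a lightweight stand-in with a ~30-evaluation budget), and the four toy features only loosely play the roles of BI, UCS, ALPHA, and DPW.

```python
import numpy as np
from scipy.stats import loguniform
from sklearn.model_selection import RandomizedSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Invented ROP-style regression data; real inputs would be rock/machine parameters.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(300, 4))
y = 2 * X[:, 0] - X[:, 1] ** 2 + 0.3 * X[:, 2] + 0.05 * rng.normal(size=300)

search = RandomizedSearchCV(
    make_pipeline(StandardScaler(), SVR()),
    param_distributions={"svr__C": loguniform(1e-1, 1e3),
                         "svr__gamma": loguniform(1e-3, 1e1),
                         "svr__epsilon": loguniform(1e-3, 1e-1)},
    n_iter=30, cv=3, random_state=0)  # ~30 evaluations, like the paper's BO budget
search.fit(X, y)
best_r2 = search.best_score_  # mean cross-validated R^2 of the best configuration
```

Swapping `RandomizedSearchCV` for a Bayesian optimizer (e.g. Optuna or scikit-optimize) keeps the rest of the pipeline unchanged.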

20 pages, 16800 KB  
Article
A Multi-Source Remote Sensing Identification Framework for Coconut Palm Mapping
by Tingting Wen, Ning Wang, Xiaoning Yao, Chunbo Li, Wenkai Bi and Xiao-Ming Li
Remote Sens. 2026, 18(1), 102; https://doi.org/10.3390/rs18010102 - 27 Dec 2025
Abstract
Coconut palms (Cocos nucifera L.) are a critical economic and ecological resource in Wenchang City, Hainan. Accurate mapping of their spatial distribution is essential for precision agricultural planning and effective pest and disease management. However, in tropical monsoon regions, persistent cloud cover, spectral similarity with other evergreen species, and redundancy among high-dimensional features hinder the performance of optical classification. To address these challenges, we developed a scalable multi-source remote sensing framework on the Google Earth Engine (GEE) with an emphasis on species-oriented feature design rather than generic feature stacking. The framework integrates Sentinel-1 SAR, Sentinel-2 MSI, and SRTM topographic data to construct a 42-dimensional feature set encompassing spectral, polarimetric, textural, and topographic attributes. Using Random Forest (RF) importance ranking and out-of-bag (OOB) error analysis, an optimal 15-feature subset was identified. Four feature combination schemes were designed to assess the contribution of each data source. The fused dataset achieved an overall accuracy (OA) of 92.51% (Kappa = 0.8928), while the RF-OOB optimized subset maintained a comparable OA of 92.83% (Kappa = 0.8975) with a 64% reduction in dimensionality. Canopy Water Index (CWI), Green Chlorophyll Index (GCI), and VV-polarized backscattering coefficient (σVV) were identified as the most discriminative features. Independent UAV validation (0.07 m resolution) in a 50 km2 area of Chongxing Town confirmed the model’s robustness (OA = 90.17%, Kappa = 0.8617). This study provides an efficient and robust framework for large-scale monitoring of tropical economic forests such as coconut palms. Full article
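The feature-selection step the abstract describes — ranking a high-dimensional feature stack by Random Forest importance and checking the reduced subset against the out-of-bag (OOB) error — looks roughly like this in scikit-learn. The 42 synthetic features merely stand in for the Sentinel-1/Sentinel-2/SRTM stack.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Illustrative stand-in for the 42-dimensional spectral/SAR/texture/topographic set.
X, y = make_classification(n_samples=800, n_features=42, n_informative=10, random_state=0)

rf = RandomForestClassifier(n_estimators=200, oob_score=True, random_state=0).fit(X, y)
full_oob = rf.oob_score_                               # OOB accuracy, all 42 features

top15 = np.argsort(rf.feature_importances_)[::-1][:15] # importance-ranked subset
rf_sub = RandomForestClassifier(n_estimators=200, oob_score=True, random_state=0)
rf_sub.fit(X[:, top15], y)
sub_oob = rf_sub.oob_score_  # often comparable to the full model at ~1/3 the features
```

The ~64% dimensionality reduction reported in the paper corresponds to exactly this kind of comparison between `full_oob` and `sub_oob`.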

30 pages, 5219 KB  
Article
Dynamic Multi-Output Stacked-Ensemble Model with Hyperparameter Optimization for Real-Time Forecasting of AHU Cooling-Coil Performance
by Md Mahmudul Hasan, Pasidu Dharmasena and Nabil Nassif
Energies 2026, 19(1), 82; https://doi.org/10.3390/en19010082 - 23 Dec 2025
Abstract
This study introduces a dynamic, multi-output stacking framework for real-time forecasting of HVAC cooling-coil behavior in air-handling units. The dynamic model encodes short-horizon system memory with input/target lags and rolling psychrometric features and enforces leakage-free, time-aware validation. Four base learners—Random Forest, Bagging (DT), XGBoost, and ANN—are each optimized with an Optuna hyperparameter tuner that systematically explores architecture and regularization to identify data-specific, near-optimal configurations. Their out-of-fold predictions are combined through a Ridge-based stacker, yielding state-of-the-art accuracy for supply-air temperature and chilled water leaving temperature (R2 up to 0.9995, NRMSE as low as 0.0105), consistently surpassing individual models. Novelty lies in the explicit dynamics encoding aligned with coil heat and mass-transfer behavior, physics-consistent feature prioritization, and a robust multi-target stacking design tailored for HVAC transients. The findings indicate that this hyperparameter-tuned dynamic framework can serve as a high-fidelity surrogate for cooling-coil performance, supporting set-point optimization, supervisory control, and future extensions to virtual sensing or fault-diagnostics workflows in industrial AHUs. Full article
(This article belongs to the Special Issue Performance Analysis of Building Energy Efficiency)
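The stacking architecture — tree-ensemble base learners whose out-of-fold predictions are combined by a Ridge meta-learner — maps directly onto scikit-learn's `StackingRegressor`. The toy data below only mimics the shape of the problem (lagged inputs predicting a coil outlet temperature); the paper's Optuna tuning and ANN/XGBoost bases are omitted.

```python
import numpy as np
from sklearn.ensemble import BaggingRegressor, RandomForestRegressor, StackingRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

# Invented AHU-style data: current + lagged psychrometric features -> temperature.
rng = np.random.default_rng(0)
X = rng.uniform(size=(600, 6))
y = 10 + 5 * X[:, 0] - 3 * X[:, 1] * X[:, 2] + 0.1 * rng.normal(size=600)

stack = StackingRegressor(
    estimators=[("rf", RandomForestRegressor(n_estimators=100, random_state=0)),
                ("bag", BaggingRegressor(random_state=0))],
    final_estimator=Ridge(),  # Ridge linearly combines the out-of-fold predictions
    cv=5)                     # internal CV keeps the meta-learner leakage-free
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
r2 = stack.fit(X_tr, y_tr).score(X_te, y_te)
```

For the paper's multi-output setting (supply-air and chilled-water temperatures), the stack would be wrapped in a multi-output regressor or fit per target.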

29 pages, 8414 KB  
Article
Optimized Explainable Machine Learning Protocol for Battery State-of-Health Prediction Based on Electrochemical Impedance Spectra
by Lamia Akther, Md Shafiul Alam, Mohammad Ali, Mohammed A. AlAqil, Tahmida Khanam and Md. Feroz Ali
Electronics 2025, 14(24), 4869; https://doi.org/10.3390/electronics14244869 - 10 Dec 2025
Abstract
Monitoring the battery state of health (SOH) has become increasingly important for electric vehicles (EVs), renewable storage systems, and consumer gadgets. It indicates the residual usable capacity and performance of a battery in relation to its original specifications. This information is crucial for the safety and performance enhancement of the overall system. This paper develops an explainable machine learning protocol with Bayesian optimization techniques trained on electrochemical impedance spectroscopy (EIS) data to predict battery SOH. Various robust ensemble algorithms, including HistGradientBoosting (HGB), Random Forest, AdaBoost, Extra Trees, Bagging, CatBoost, Decision Tree, LightGBM, Gradient Boost, and XGB, have been developed and fine-tuned for predicting battery health. Eight comprehensive metrics are employed to estimate the model’s performance rigorously: coefficient of determination (R2), mean squared error (MSE), median absolute error (medae), mean absolute error (MAE), correlation coefficient (R), Nash–Sutcliffe efficiency (NSE), Kling–Gupta efficiency (KGE), and root mean squared error (RMSE). Bayesian optimization techniques were developed to optimize hyperparameters across all models, ensuring optimal implementation of each algorithm. Feature importance analysis was performed to thoroughly evaluate the models and assess the features with the most influence on battery health degradation. The comparison indicated that the GradientBoosting model outperformed others, achieving an MAE of 0.1041 and an R2 of 0.9996. The findings suggest that Bayesian-optimized tree-based ensemble methods, particularly gradient boosting, excel at forecasting battery health status from electrochemical impedance spectroscopy data. 
This result offers an excellent opportunity for practical use in battery management systems that employ diverse industrial state-of-health assessment techniques to enhance battery longevity, contributing to sustainability initiatives for second-life lithium-ion batteries. This capability enables the recycling of vehicle batteries for application in static storage systems, which is environmentally advantageous and ensures continuity. Full article
(This article belongs to the Special Issue Advanced Control and Power Electronics for Electric Vehicles)
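The winning model family (gradient boosting) and one of the less common metrics in the paper's suite, Nash–Sutcliffe efficiency (NSE), can be sketched together. The EIS-style features and linear SOH target below are entirely synthetic stand-ins for the measured impedance spectra.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

def nse(y_true, y_pred):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance about the mean (1 is perfect)."""
    return 1 - np.sum((y_true - y_pred) ** 2) / np.sum((y_true - np.mean(y_true)) ** 2)

# Hypothetical EIS-style features (e.g. impedance magnitudes at a few frequencies)
# mapped to a synthetic SOH target in percent.
rng = np.random.default_rng(0)
X = rng.uniform(0.01, 0.1, size=(500, 8))
y = 100 - 400 * X[:, 0] - 150 * X[:, 3] + 0.2 * rng.normal(size=500)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = GradientBoostingRegressor(random_state=0).fit(X_tr, y_tr)
pred = model.predict(X_te)
mae = np.mean(np.abs(y_te - pred))
score_nse = nse(y_te, pred)
```

The paper's Bayesian hyperparameter optimization would wrap `GradientBoostingRegressor`'s learning rate, depth, and tree count in an outer tuning loop.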

25 pages, 1358 KB  
Article
Incorporating Uncertainty in Machine Learning Models to Improve Early Detection of Flavescence Dorée: A Demonstration of Applicability
by Cristina Nuzzi, Erica Saldi, Ilaria Negri and Simone Pasinetti
Sensors 2025, 25(24), 7493; https://doi.org/10.3390/s25247493 - 9 Dec 2025
Abstract
Early detection of Flavescence dorée leaf symptoms remains an open question for the research community. This work tries to fill this gap by proposing a methodology exploiting per-pixel data obtained from hyperspectral imaging to produce features suitable for machine learning training. However, since asymptomatic samples are similar to healthy samples, we propose “uncertainty-aware” models that address the probability of the samples being similar, thus producing, as output, an “unclassified” category when the uncertainty between multiple classes is too high. The original dataset of leaves hypercubes was collected in a field of Pinot Noir in northern Italy during 2023 and 2024, for a total of 201 hypercubes equally divided into three classes (“healthy”, “asymptomatic”, “diseased”). Feature predictors were 4 for each of the 10 vegetation indices (population quartiles 25-50-75 and population’s mean), for a total of 40 predictors in total per leaf. Due to the low number of samples, it was not possible to estimate the uncertainty of the input data reliably. Thus, we adopted a double Monte Carlo procedure: First, we generated 30,000 synthetic hypercubes, thus computing the per class variance of each feature predictor. Second, we used this variance (serving as uncertainty of the input data) to generate 60,000 new predictors starting from the data in the test dataset. The trained models were therefore tested on these new data, and their predictions were further examined by a Bayesian test for validation purposes. It is highlighted that the proposed method notably improves recognition of “asymptomatic” samples with respect to the original models. The best model structure is the Decision Tree, achieving a prediction accuracy for “asymptomatic” samples of 75.7% against the original 49.3% for the Ensemble of Bagged Decision Trees (ML4) and of 44.6% against the original 13.2% for the Coarse Decision Tree (ML1). Full article
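The "uncertainty-aware" output mechanism — emitting an "unclassified" label when class probabilities are too close — can be realized as a simple margin rule on `predict_proba`. The three-class synthetic data and the 0.2 margin threshold below are illustrative choices, not the paper's hyperspectral dataset or Monte Carlo procedure.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Illustrative 3-class stand-in for healthy / asymptomatic / diseased leaves.
X, y = make_classification(n_samples=600, n_features=40, n_informative=8,
                           n_classes=3, n_clusters_per_class=1, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = DecisionTreeClassifier(min_samples_leaf=10, random_state=0).fit(X_tr, y_tr)

UNCLASSIFIED = -1
proba = clf.predict_proba(X_te)
top2 = np.sort(proba, axis=1)[:, -2:]      # two largest class probabilities per sample
margin = top2[:, 1] - top2[:, 0]
# Refuse to classify when the winning class barely beats the runner-up.
pred = np.where(margin > 0.2, np.argmax(proba, axis=1), UNCLASSIFIED)
```

Routing the `UNCLASSIFIED` samples to human inspection is what lets the method trade coverage for the improved asymptomatic-class accuracy the abstract reports.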

31 pages, 1941 KB  
Article
Boosting Traffic Crash Prediction Performance with Ensemble Techniques and Hyperparameter Tuning
by Naima Goubraim, Zouhair Elamrani Abou Elassad, Hajar Mousannif and Mohamed Ameksa
Safety 2025, 11(4), 121; https://doi.org/10.3390/safety11040121 - 9 Dec 2025
Abstract
Road traffic crashes are a major global challenge, resulting in significant loss of life, economic burden, and societal impact. This study seeks to enhance the precision of traffic accident prediction using advanced machine learning techniques. This study employs an ensemble learning approach combining the Random Forest, the Bagging Classifier (Bootstrap Aggregating), the Extreme Gradient Boosting (XGBoost) and the Light Gradient Boosting Machine (LightGBM) algorithms. To address class imbalance and feature relevance, we implement feature selection using the Extra Trees Classifier and oversampling using the Synthetic Minority Over-sampling Technique (SMOTE). Rigorous hyperparameter tuning is applied to optimize model performance. Our results show that the ensemble approach, coupled with hyperparameter optimization, significantly improves prediction accuracy. This research contributes to the development of more effective road safety strategies and can help to reduce the number of road accidents. Full article
(This article belongs to the Special Issue Road Traffic Risk Assessment: Control and Prevention of Collisions)
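The SMOTE step used to rebalance the crash data interpolates new minority samples between existing ones and their nearest minority neighbors. Here is a minimal from-scratch version of that core idea (in practice the `imblearn` SMOTE implementation would be used); the 20-sample "crash" minority set is invented.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def smote_oversample(X_min, n_new, k=5, seed=0):
    """Core SMOTE step: synthesize minority samples by linear interpolation
    between a minority point and one of its k nearest minority neighbors."""
    rng = np.random.default_rng(seed)
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_min)
    _, idx = nn.kneighbors(X_min)                    # idx[:, 0] is the point itself
    base = rng.integers(0, len(X_min), size=n_new)   # random anchor points
    neigh = idx[base, rng.integers(1, k + 1, size=n_new)]  # random true neighbor
    lam = rng.random((n_new, 1))                     # interpolation factor in [0, 1)
    return X_min[base] + lam * (X_min[neigh] - X_min[base])

# Hypothetical imbalanced crash data: 20 crash records with 6 features each.
rng = np.random.default_rng(1)
X_minority = rng.normal(loc=2.0, size=(20, 6))
X_synth = smote_oversample(X_minority, n_new=80)
```

The synthetic rows are then appended to the training split only, before fitting the RF/Bagging/XGBoost/LightGBM ensemble.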

24 pages, 575 KB  
Article
Sensitivity-Constrained Evolutionary Feature Selection for Imbalanced Medical Classification: A Case Study on Rotator Cuff Tear Surgery Prediction
by José María Belmonte, Fernando Jiménez, Gracia Sánchez, Santiago Gabardo, Natalia Martínez-Catalán, Emilio Calvo, Gregorio Bernabé and José Manuel García
Algorithms 2025, 18(12), 774; https://doi.org/10.3390/a18120774 - 8 Dec 2025
Abstract
While most patients with degenerative rotator cuff tears respond to conservative treatment, a minority progress to surgery. To anticipate these cases under class imbalance, we propose a sensitivity-constrained evolutionary feature selection framework prioritizing surgical-class recall, benchmarked against traditional methods. Two variants are proposed: (i) a single-objective search maximizing balanced accuracy and (ii) a multi-objective search also minimizing the number of selected features. Both enforce a minimum-sensitivity constraint on the minority class to limit false negatives. The dataset includes 347 patients (66 surgical, 19%) described by 28 clinical, imaging, symptom, and functional variables. We compare against 62 widely adopted pipelines, including oversampling, undersampling, hybrid resampling, cost-sensitive classifiers, and imbalance-aware ensembles. The main metric is balanced accuracy, with surgical-class F1-score as secondary. Pairwise Wilcoxon tests with a win–loss ranking assessed statistical significance. Evolutionary models rank among the top; the multi-objective variant with a Balanced Bagging Classifier performs best, achieving a mean balanced accuracy of 0.741. Selected subsets recurrently include age, tear location/severity, comorbidities, and pain/functional scores, matching clinical expectations. The constraint preserved minority-class recall without discarding or synthesizing data. Sensitivity-constrained evolutionary feature selection thus offers a data-preserving, interpretable solution for pre-surgical decision support, improving balanced performance and supporting safer triage decisions. Full article
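The sensitivity-constrained subset search can be illustrated with a deliberately simplified loop: candidate feature bitmasks are scored by balanced accuracy, and any candidate whose minority-class recall falls below a hard floor is rejected. A random search stands in for the paper's evolutionary operators, and the imbalanced toy data (~19% positives, 28 features) only mirrors the cohort's shape.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import balanced_accuracy_score, recall_score
from sklearn.model_selection import train_test_split

# Imbalanced toy data mirroring the 19% surgical-class rate over 28 variables.
X, y = make_classification(n_samples=400, n_features=28, n_informative=8,
                           weights=[0.81, 0.19], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

def evaluate(mask):
    """Balanced accuracy and minority-class recall for one feature subset."""
    clf = LogisticRegression(max_iter=1000, class_weight="balanced")
    pred = clf.fit(X_tr[:, mask], y_tr).predict(X_te[:, mask])
    return balanced_accuracy_score(y_te, pred), recall_score(y_te, pred)

# Random-search stand-in for the evolutionary loop: keep the best subset whose
# minority-class sensitivity satisfies the hard constraint.
rng = np.random.default_rng(0)
MIN_SENS, best_mask, best_ba = 0.7, None, -1.0
for _ in range(60):
    mask = rng.random(28) < 0.4          # candidate feature bitmask (a "chromosome")
    if mask.sum() == 0:
        continue
    ba, sens = evaluate(mask)
    if sens >= MIN_SENS and ba > best_ba:
        best_ba, best_mask = ba, mask
```

The paper's multi-objective variant additionally minimizes `mask.sum()`, yielding a Pareto front of small, sensitive feature subsets.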

27 pages, 11265 KB  
Article
Using Machine Learning Methods to Predict Cognitive Age from Psychophysiological Tests
by Daria D. Tyurina, Sergey V. Stasenko, Konstantin V. Lushnikov and Maria V. Vedunova
Healthcare 2025, 13(24), 3193; https://doi.org/10.3390/healthcare13243193 - 5 Dec 2025
Abstract
Background/Objectives: This paper presents the results of predicting chronological age from psychophysiological tests using machine learning regressors. Methods: Subjects completed a series of psychological tests measuring various cognitive functions, including reaction time and cognitive conflict, short-term memory, verbal functions, and color and spatial perception. The sample included 99 subjects, 68 percent of whom were men and 32 percent were women. Based on the test results, 43 features were generated. To determine the optimal feature selection method, several approaches were tested alongside the regression models using MAE, R2, and CV_R2 metrics. SHAP and Permutation Importance (via Random Forest) delivered the best performance with 10 features. Features selected through Permutation Importance were used in subsequent analyses. To predict participants’ age from psychophysiological test results, we evaluated several regression models, including Random Forest, Extra Trees, Gradient Boosting, SVR, Linear Regression, LassoCV, RidgeCV, ElasticNetCV, AdaBoost, and Bagging. Model performance was compared using the determination coefficient (R2) and mean absolute error (MAE). Cross-validated performance (CV_R2) was estimated via 5-fold cross-validation. To assess metric stability and uncertainty, bootstrapping (1000 resamples) was applied to the test set, yielding distributions of MAE and RMSE from which mean values and 95% confidence intervals were derived. Results: The study identified RidgeCV with winsorization and standardization as the best model for predicting cognitive age, achieving a mean absolute error of 5.7 years and an R2 of 0.60. Feature importance was evaluated using SHAP values and permutation importance. SHAP analysis showed that stroop_time_color and stroop_var_attempt_time were the strongest predictors, followed by several task-timing features with moderate contributions. 
Permutation importance confirmed this ranking, with these two features causing the largest performance drop when permuted. Partial dependence plots further indicated clear positive relationships between these key features and predicted age. Correlation analysis stratified by sex revealed that most features were significantly associated with age, with stronger effects generally observed in men. Conclusions: Feature selection revealed Stroop timing measures and task-related metrics from math and campimetry tests as the strongest predictors, reflecting core cognitive processes linked to aging. The results underscore the value of careful outlier handling, feature selection, and interpretable regularized models for analyzing psychophysiological data. Future work should include longitudinal studies and integration with biological markers to further improve clinical relevance. Full article
(This article belongs to the Special Issue AI-Driven Healthcare Insights)
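The bootstrap procedure described in the abstract above (1000 resamples of the test set, yielding a mean MAE and a 95% confidence interval) can be sketched with the Python standard library alone. This is a minimal illustration, not the study's code; the ages and predictions below are hypothetical:

```python
import random
import statistics

def bootstrap_mae_ci(y_true, y_pred, n_resamples=1000, alpha=0.05, seed=0):
    """Resample the test set with replacement and collect MAE values,
    then report their mean and an empirical 95% confidence interval."""
    rng = random.Random(seed)
    n = len(y_true)
    maes = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        maes.append(sum(abs(y_true[i] - y_pred[i]) for i in idx) / n)
    maes.sort()
    lo = maes[int((alpha / 2) * n_resamples)]
    hi = maes[int((1 - alpha / 2) * n_resamples) - 1]
    return statistics.mean(maes), (lo, hi)

# Hypothetical chronological ages and model predictions
y_true = [25, 40, 33, 58, 61, 47, 29, 52]
y_pred = [30, 38, 35, 50, 66, 45, 27, 57]
mean_mae, (ci_lo, ci_hi) = bootstrap_mae_ci(y_true, y_pred)
```

The same loop applied to squared errors (with a square root at the end) would give the RMSE distribution mentioned in the abstract.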

17 pages, 2077 KB  
Article
Carbon Footprint of Plastic Bags and Polystyrene Dishes vs. Starch-Based Biodegradable Packaging in Amazonian Settlements
by Johanna Garavito, Néstor C. Posada, Clara P. Peña-Venegas and Diego A. Castellanos
Polymers 2025, 17(24), 3242; https://doi.org/10.3390/polym17243242 - 5 Dec 2025
Viewed by 842
Abstract
The carbon footprint is an indicator used to assess the full life cycle of a product and predict its environmental impact. The packaging industry is rapidly shifting to biodegradable products to mitigate the negative environmental consequences of single-use packaging. Biodegradable packaging is assumed to be more sustainable than traditional plastics because of the raw materials used to produce it, but this is not always true: the outcome depends on the issues considered, the methodology, and the scale analyzed. Limited research includes case studies from developing countries, where waste management is less efficient and the environmental impacts of single-use packaging can be more significant. This paper evaluates the carbon footprint of bags and dishes made from traditional petrochemical sources or from local biodegradable ones, such as thermoplastic cassava starch and powdered plantain leaves, in an Amazonian settlement of Colombia, comparing locally made biodegradable packaging against imported petrochemical packaging. Results show that using local raw materials and in situ production reduces the carbon footprint of biodegradable packaging, since the energy sources for production and transport are important contributors to the carbon footprint beyond the raw materials used, with ratios between 0.1 and 7 times the kg CO2 eq generated per functional unit. Full article
(This article belongs to the Special Issue Applications of Biopolymer-Based Composites in Food Technology)
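The per-functional-unit comparison behind the reported 0.1–7× ratios amounts to summing the emissions of each life-cycle stage and dividing one total by the other. A minimal sketch follows; the inventory figures are entirely hypothetical and are not the paper's data:

```python
# Hypothetical kg CO2 eq per functional unit, broken down by life-cycle stage.
footprint = {
    "imported_polyethylene_bag": {"raw_material": 0.040, "production": 0.015, "transport": 0.030},
    "local_starch_bag":          {"raw_material": 0.010, "production": 0.020, "transport": 0.002},
}

def total(product):
    """Total kg CO2 eq per functional unit across all stages."""
    return sum(footprint[product].values())

# Ratio of the imported petrochemical option to the locally produced one
ratio = total("imported_polyethylene_bag") / total("local_starch_bag")
```

With numbers like these, transport and production dominate the difference, which mirrors the abstract's point that energy sources and transport matter beyond the raw materials themselves.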

22 pages, 6983 KB  
Article
Bagging-PiFormer: An Ensemble Transformer Framework with Cross-Channel Attention for Lithium-Ion Battery State-of-Health Estimation
by Shaofang Wu, Jifei Zhao, Weihong Tang, Xuhui Liu and Yuqian Fan
Batteries 2025, 11(12), 447; https://doi.org/10.3390/batteries11120447 - 5 Dec 2025
Viewed by 431
Abstract
Accurate estimation of lithium-ion battery (LIB) state of health (SOH) is critical for prolonging battery life and ensuring safe operation. To address the limitations of existing data-driven models in robustness and feature coupling, this paper presents a new Bagging-PiFormer framework for SOH estimation. The framework integrates ensemble learning with an improved Transformer architecture to achieve accurate and stable performance across various degradation conditions. Specifically, multiple PiFormer base models are trained independently under the Bagging strategy to enhance generalization. Each PiFormer consists of a stack of PiFormer layers, each combining a cross-channel attention mechanism that models voltage–current interactions with a local convolutional feed-forward network (LocalConvFFN) that extracts local degradation patterns from charging curves. Residual connections and layer normalization stabilize gradient propagation in deep layers, while a purely linear output head enables precise regression of continuous SOH values. Experimental results on three datasets demonstrate that the proposed method achieves the lowest MAE, RMSE, and MAXE values among all compared models, reducing overall error by 10–33% relative to mainstream deep-learning methods such as Transformer, CNN-LSTM, and GCN-BiLSTM. These results confirm that the Bagging-PiFormer framework significantly improves both the accuracy and robustness of battery SOH estimation. Full article
(This article belongs to the Section Battery Performance, Ageing, Reliability and Safety)
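The Bagging strategy named in the abstract above (independent base models trained on bootstrap resamples, predictions averaged) can be illustrated with simple linear base learners standing in for the PiFormer networks. A stdlib-only sketch; the cycle/SOH numbers are invented for illustration:

```python
import random

def fit_linear(xs, ys):
    """Ordinary least squares for one feature: y ≈ a*x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:  # degenerate resample: fall back to a constant predictor
        return 0.0, my
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return a, my - a * mx

def bagging_fit(xs, ys, n_models=10, seed=0):
    """Train each base model on its own bootstrap resample (the Bagging strategy)."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_linear([xs[i] for i in idx], [ys[i] for i in idx]))
    return models

def bagging_predict(models, x):
    """Average the base models' predictions."""
    return sum(a * x + b for a, b in models) / len(models)

# Toy data: SOH fading roughly linearly with cycle count
cycles = [0, 50, 100, 150, 200, 250, 300]
soh    = [1.00, 0.97, 0.93, 0.91, 0.87, 0.84, 0.80]
models = bagging_fit(cycles, soh)
est = bagging_predict(models, 175)
```

Averaging over resampled training sets reduces the variance of the ensemble relative to any single base model, which is the generalization benefit the abstract attributes to Bagging.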
