MDPI - Publisher of Open Access Journals

21 pages, 2947 KB

Open Access Article

HFSOF: A Hierarchical Feature Selection and Optimization Framework for Ultrasound-Based Diagnosis of Endometrial Lesions

by Yongjun Liu, Zihao Zhang, Tongyu Chai and Haitong Zhao

Biomimetics 2026, 11(1), 74; https://doi.org/10.3390/biomimetics11010074 - 15 Jan 2026

Abstract

Endometrial lesions are common in gynecology, exhibiting considerable clinical heterogeneity across different subtypes. Although ultrasound imaging is the preferred diagnostic modality due to its noninvasive, accessible, and cost-effective nature, its diagnostic performance remains highly operator-dependent, leading to subjectivity and inconsistent results. To address these limitations, this study proposes a hierarchical feature selection and optimization framework for endometrial lesions, aiming to enhance the objectivity and robustness of ultrasound-based diagnosis. Firstly, Kernel Principal Component Analysis (KPCA) is employed for nonlinear dimensionality reduction, retaining the top 1000 principal components. Secondly, an ensemble of three filter-based methods—information gain, chi-square test, and symmetrical uncertainty—is integrated to rank and fuse features, followed by thresholding with Maximum Scatter Difference Linear Discriminant Analysis (MSDLDA) for preliminary feature selection. Finally, the Whale Migration Algorithm (WMA) is applied to population-based feature optimization and classifier training under the constraints of a Support Vector Machine (SVM) and a macro-averaged F1 score. Experimental results demonstrate that the proposed closed-loop pipeline of “kernel reduction—filter fusion—threshold pruning—intelligent optimization—robust classification” effectively balances nonlinear structure preservation, feature redundancy control, and model generalization, providing an interpretable, reproducible, and efficient solution for intelligent diagnosis in small- to medium-scale medical imaging datasets. Full article
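
The macro-averaged F1 score named above as the optimization criterion can be illustrated with a short sketch. This is not the authors' code: the function name `macro_f1` and the toy labels are hypothetical, and the sketch only shows how per-class F1 values are averaged with equal weight regardless of class size.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (hypothetical helper)."""
    labels = sorted(set(y_true) | set(y_pred))
    per_class = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        per_class.append(2 * tp / denom if denom else 0.0)
    return sum(per_class) / len(per_class)


# Toy example with three classes (labels are illustrative only).
score = macro_f1(["a", "a", "b", "c"], ["a", "b", "b", "c"])
```

Because every class contributes equally, a rare lesion subtype that is classified poorly lowers the score as much as a common one would.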

(This article belongs to the Special Issue Bio-Inspired AI: When Generative AI and Biomimicry Overlap)

18 pages, 1289 KB

Open Access Article

Machine Learning-Based Automatic Diagnosis of Osteoporosis Using Bone Mineral Density Measurements

by Nilüfer Aygün Bilecik, Levent Uğur, Erol Öten and Mustafa Çapraz

J. Clin. Med. 2026, 15(2), 549; https://doi.org/10.3390/jcm15020549 - 9 Jan 2026

Viewed by 190

Abstract

Background: Osteoporosis and osteopenia are prevalent bone diseases characterized by reduced bone mineral density (BMD) and an increased risk of fractures, particularly in postmenopausal women. While dual-energy X-ray absorptiometry (DXA) remains the gold standard for diagnosis, it has limitations regarding accessibility, cost, and predictive capacity for fracture risk. Machine learning (ML) approaches offer an opportunity to develop automated and more accurate diagnostic models by incorporating both BMD values and clinical variables. Method: This study retrospectively analyzed BMD data from 142 postmenopausal women, classified into 3 diagnostic groups: normal, osteopenia, and osteoporosis. Various supervised ML algorithms—including Support Vector Machines (SVM), k-Nearest Neighbors (k-NN), Decision Trees (DT), Naive Bayes (NB), Linear Discriminant Analysis (LDA), and Artificial Neural Networks (ANN)—were applied. Feature selection techniques such as ANOVA, CHI2, MRMR, and Kruskal–Wallis were used to enhance model performance, reduce dimensionality, and improve interpretability. Model performance was evaluated using 10-fold cross-validation based on accuracy, true positive rate (TPR), false negative rate (FNR), and AUC values. Results: Among all models and feature selection combinations, SVM with ANOVA-selected features achieved the highest classification accuracy (94.30%) and 100% TPR for the normal class. Feature sets based on traditional diagnostic regions (L1–L4, femoral neck, total femur) also showed high accuracy (up to 90.70%) but were generally outperformed by statistically selected features. CHI2 and MRMR methods also yielded robust results, particularly when paired with SVM and k-NN classifiers. The results highlight the effectiveness of combining statistical feature selection with ML to enhance diagnostic precision for osteoporosis and osteopenia. 
Conclusions: Machine learning algorithms, when integrated with data-driven feature selection strategies, provide a promising framework for automated classification of osteoporosis and osteopenia based on BMD data. ANOVA emerged as the most effective feature selection method, yielding superior accuracy across all classifiers. These findings support the integration of ML-based decision support tools into clinical workflows to facilitate early diagnosis and personalized treatment planning. Future studies should explore more diverse and larger datasets, incorporating genetic, lifestyle, and hormonal factors for further model enhancement. Full article
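
ANOVA-based feature selection usually scores each feature with a one-way F-statistic across the diagnostic classes. The sketch below shows that score for a single feature; the function name `anova_f` and the toy values are hypothetical, and the abstract does not say how the scores were thresholded, so no cut-off is shown.

```python
def anova_f(groups):
    """One-way ANOVA F-statistic for one feature.

    `groups` holds the feature's values per class, e.g. one list each for
    normal, osteopenia, and osteoporosis (hypothetical layout).
    Raises ZeroDivisionError if every group has zero within-class variance.
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    # Ratio of between-class to within-class mean squares.
    return (ss_between / (k - 1)) / (ss_within / (n - k))


f_value = anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
```

A larger F-value means the class means differ more relative to the spread inside each class, which is why such features tend to be ranked first.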

(This article belongs to the Section Orthopedics)

26 pages, 2150 KB

Open Access Article

A Stability-Oriented Biomarker Selection Framework Synergistically Driven by Robust Rank Aggregation and L1-Sparse Modeling

by Jigen Luo, Jianqiang Du, Jia He, Qiang Huang, Zixuan Liu and Gaoxiang Huang

Metabolites 2025, 15(12), 806; https://doi.org/10.3390/metabo15120806 - 18 Dec 2025

Viewed by 300

Abstract

Background: In high-dimensional, small-sample omics studies such as metabolomics, feature selection not only determines the discriminative performance of classification models but also directly affects the reproducibility and translational value of candidate biomarkers. However, most existing methods primarily optimize classification accuracy and treat stability as a post hoc diagnostic, leading to considerable fluctuations in selected feature sets under different data splits or mild perturbations. Methods: To address this issue, this study proposes FRL-TSFS, a feature selection framework synergistically driven by filter-based Robust Rank Aggregation and L1-sparse modeling. Five complementary filter methods—variance thresholding, chi-square test, mutual information, ANOVA F test, and ReliefF—are first applied in parallel to score features, and Robust Rank Aggregation (RRA) is then used to obtain a consensus feature ranking that is less sensitive to the bias of any single scoring criterion. An L1-regularized logistic regression model is subsequently constructed on the candidate feature subset defined by the RRA ranking to achieve task-coupled sparse selection, thereby linking feature selection stability, feature compression, and classification performance. Results: FRL-TSFS was evaluated on six representative metabolomics and gene expression datasets under a mildly perturbed scenario induced by 10-fold cross-validation, and its performance was compared with multiple baselines using the Extended Kuncheva Index (EKI), Accuracy, and F1-score. The results show that RRA substantially improves ranking stability compared with conventional aggregation strategies without degrading classification performance, while the full FRL-TSFS framework consistently attains higher EKI values than the other feature selection schemes, markedly reduces the number of selected features to several tens of metabolites or genes, and maintains competitive classification performance. 
Conclusions: These findings indicate that FRL-TSFS can generate compact, reproducible, and interpretable biomarker panels, providing a practical analysis framework for stability-oriented feature selection and biomarker discovery in untargeted metabolomics. Full article
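
Selection stability across cross-validation folds can be quantified by comparing the feature subsets chosen in each fold. The sketch below implements the original Kuncheva index for equal-sized subsets, not the Extended Kuncheva Index used in the study, whose exact form is not given in the abstract. Function names and the toy subsets are hypothetical.

```python
from itertools import combinations


def kuncheva_index(a, b, n_features):
    """Kuncheva consistency index for two equal-sized feature subsets."""
    a, b = set(a), set(b)
    k = len(a)
    if k != len(b):
        raise ValueError("subsets must have the same size")
    if k == 0 or k == n_features:
        raise ValueError("index is undefined for empty or full subsets")
    r = len(a & b)  # number of shared features
    # (r*n - k^2) / (k*(n - k)) corrects the overlap for chance agreement.
    return (r * n_features - k * k) / (k * (n_features - k))


def mean_pairwise_stability(subsets, n_features):
    """Average the index over all pairs of per-fold subsets."""
    pairs = list(combinations(subsets, 2))
    return sum(kuncheva_index(a, b, n_features) for a, b in pairs) / len(pairs)


stability = mean_pairwise_stability([{0, 1}, {0, 1}, {2, 3}], n_features=10)
```

The index is 1 for identical subsets and falls below 0 when two subsets overlap less than random selection would.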

(This article belongs to the Special Issue Machine Learning in Metabolomics: Unlocking the Future of Data Analysis)

15 pages, 2426 KB

Open Access Proceeding Paper

Scalable Machine Learning Solutions for High-Volume Financial Transaction Fraud Detection

by Sourav Yallur, Jiya Patil, Tanvi Shikhari, Prajwal Dabbanavar, Rajashri Khanai and Salma Shahpur

Comput. Sci. Math. Forum 2025, 12(1), 1; https://doi.org/10.3390/cmsf2025012001 - 17 Dec 2025

Viewed by 251

Abstract

More reliable and intelligent detection systems are required because the growing volume of digital financial transactions has brought a rise in fraudulent activity. This work uses a publicly accessible dataset of more than a million transaction records to investigate a machine learning strategy for identifying hidden patterns in fraudulent transactions. Data preprocessing included applying Z-score normalization, eliminating outliers using the IQR method, and handling missing values according to the skewness of each attribute. The selection of important features was guided by correlation analysis using Chi-square tests and Pearson coefficients. The study implemented multiple supervised learning techniques (Random Forest, Logistic Regression, K-Nearest Neighbors, and Gradient Boosting) and compared their effectiveness in accurately detecting fraudulent transactions. Full article
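
Two of the preprocessing steps named above, IQR-based outlier removal and Z-score normalization, can be sketched with the standard library. The function names, the 1.5 multiplier, the quartile method, and the order of the two steps are assumptions; the abstract does not specify them.

```python
import statistics


def iqr_filter(values, k=1.5):
    """Keep values inside [Q1 - k*IQR, Q3 + k*IQR] (k=1.5 is an assumed default)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


def z_scores(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        raise ValueError("cannot standardize a constant column")
    return [(v - mean) / sd for v in values]


# Toy transaction amounts; 100 lies outside the IQR fences and is dropped.
cleaned = iqr_filter([1, 2, 3, 4, 5, 100])
scaled = z_scores(cleaned)
```

Filtering before scaling keeps a single extreme amount from inflating the standard deviation used for normalization.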

(This article belongs to the Proceedings of First International Conference on Computational Intelligence and Soft Computing (CISCom 2025))

19 pages, 1003 KB

Open Access Article

Diagnosis of Sarcoidosis Through Supervised Ensemble Method and GenAI-Based Data Augmentation: An Intelligent Diagnostic Tool

by Shwetha Rai, Adam Azman Abubakar, Roopashri Shetty, Gururaj Bijur, Nakul K. Shetty and Archana Praveen Kumar

Appl. Sci. 2025, 15(22), 12213; https://doi.org/10.3390/app152212213 - 18 Nov 2025

Viewed by 337

Abstract

Sarcoidosis, one of the rarest diseases, is challenging to diagnose because it mimics the symptoms of other diseases. Machine learning algorithms can identify hidden patterns among symptoms, making them suitable for early diagnosis of Sarcoidosis. In this study, four ensemble models are developed from baseline classifiers and applied to a symptom-based secondary dataset to explore the hidden information. The dataset comprises 189 patient records with 14 attributes: 2 serum markers, 10 symptoms, the patient’s gender, and 1 target variable. Exploratory data analysis is carried out with preprocessing that includes missing value imputation and data scaling. The features are selected using PCA, and their relevance is analyzed using the Chi-Square Test, Mutual Information, Sequential Feature Selection, and Tree-Based Selection methods. Because the dataset contains only 189 records, CTGAN, a GenAI technique, is used to augment it. CTGAN preserves the clinical fidelity of all the features pertaining to the diagnosis of Sarcoidosis, ensuring that the synthetic data retain meaningful diagnostic patterns. The models are evaluated on both the original and the synthetic data. Results demonstrate that the proposed ensemble methods, Model Combinations 1, 3, and 4, reached 99.47% accuracy on the original dataset, whereas Model Combination 1 and the Random Forest classifier reached 85.19% and 60.78% accuracy on the original data combined with 81 and 1000 synthetic records, respectively. This highlights the combined advantage of CTGAN-based augmentation and ensemble learning in diagnostic modeling for rare diseases such as Sarcoidosis, for which only small datasets are available. Full article

(This article belongs to the Special Issue Exploring AI: Methods and Applications for Data Mining)

13 pages, 346 KB

Open Access Article

Social Determinants of Health Patterns in Children with Severe Disease Due to SARS-CoV-2 Infection—An Exploratory Approach

by Joshua Prabhu, Sebastian Acosta, Fabio Savorgnan, Ananth V. Annapragada and Usha Sethuraman

Children 2025, 12(11), 1515; https://doi.org/10.3390/children12111515 - 9 Nov 2025

Viewed by 498

Abstract

Background/Objectives: Research on the association of adverse social determinants of health (SDOH) with severe pediatric coronavirus disease (COVID-19) is limited. We examined associations between SDOH patterns and COVID-19 severity in children. Methods: We conducted a prospective, observational study of children (<18 years) with symptomatic SARS-CoV-2 infection evaluated in an urban pediatric emergency department (March 2021–April 2022) in Detroit, Michigan. Caregivers completed a 34-item survey based on the Healthy People 2030 framework. Severe disease was defined as the occurrence of respiratory/cardiac failure or death within four weeks of diagnosis. Continuous and categorical variables were described using medians and percentages, respectively. Associations between disease severity and risk factors were determined using chi-square tests. Association rule mining was used for feature selection, followed by multivariate logistic regression. Results: We analyzed data from 354 children [6–12 years: 31.1%, Female: 51.1%, Black: 59%, not Hispanic: 84.7%, public insurance: 77.1%, chronic condition: 27.4%]. Of the total, 113 children had severe disease. Most caregivers were 30–44 years old (53.1%), had less than a college degree (70.4%), and income < USD 50,000 (75.2%). Adverse SDOH reported included food/housing insecurity (24.6%), no support (64.7%), unmet childcare needs (35.9%), and lack of transportation (12.7%). After controlling for age, sex, medical history, income, and obesity, severe disease was associated with caregiver use of drugs/alcohol (OR:5.92, p < 0.001) and social discrimination/lack of support (OR: 1.74, p = 0.030). Conclusions: Two SDOH patterns (caregiver use of drugs/alcohol and social discrimination/lack of support) were associated with severe COVID-19. Further studies are needed to confirm findings and develop interventions. Full article
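
The chi-square association test mentioned in the Methods can be sketched for a contingency table of exposure versus severity. The counts below are invented for illustration and are not study data. The odds ratio shown is the crude, unadjusted one, whereas the abstract reports adjusted odds ratios from multivariate logistic regression. All function names are hypothetical.

```python
import math


def chi_square_statistic(table):
    """Pearson chi-square statistic for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


def p_value_df1(stat):
    """Upper-tail p-value for 1 degree of freedom (valid for 2 x 2 tables only)."""
    return math.erfc(math.sqrt(stat / 2))


def crude_odds_ratio(table):
    """Unadjusted odds ratio a*d / (b*c) for a 2 x 2 table."""
    (a, b), (c, d) = table
    return (a * d) / (b * c)


# Rows: exposed / not exposed; columns: severe / not severe (invented counts).
toy_table = [[10, 20], [20, 10]]
stat = chi_square_statistic(toy_table)
p_value = p_value_df1(stat)
```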

21 pages, 2879 KB

Open Access Article

Prediction of Coal Calorific Value Based on Coal Quality-Derived Indicators and Support Vector Regression Method

by Xin Wang, Dahu Li, Youxiang Jiao, Yibin Yang and Zhao Cao

Energies 2025, 18(21), 5600; https://doi.org/10.3390/en18215600 - 24 Oct 2025

Cited by 1 | Viewed by 881

Abstract

This study addresses the limitations of traditional coal calorific value prediction models, which primarily rely on linear regression and single-source proximate analysis data. Based on 465 Chinese coal samples and integrating proximate analysis, ultimate analysis, and constructed derived indicators (combustible content—CC, carbon–hydrogen index—CHI, carbon in combustibles—CIC), a nonlinear modeling method combining mean impact value (MIV) feature selection and support vector regression (SVR) is proposed. The results show that the Pearson correlation coefficients between the derived indicators and net calorific value (NCV) all exceed 0.93, outperforming the original items. Using CC–CHI–CIC–FC_ad as characteristic variables, the established SVR model achieved a mean absolute percentage error (MAPE), root mean square error (RMSE), and coefficient of determination (R²) of 1.838%, 0.544 MJ/kg, and 0.962, respectively, with exceptionally high statistical significance (F = 1485.96, p < 0.001). The predictive accuracy of this model is significantly superior to traditional linear models, while the proposed linear model based on the derived indicators (R² > 0.900) can serve as an alternative for rapid estimation. This method effectively enhances the accuracy and robustness of coal calorific value prediction. Full article
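
The three error measures reported above (MAPE, RMSE, and R²) follow standard definitions, sketched here for reference. The function name and the toy calorific values are hypothetical and unrelated to the 465-sample dataset.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (MAPE in percent, RMSE, R^2) for paired observations."""
    n = len(y_true)
    pairs = list(zip(y_true, y_pred))
    # MAPE assumes no true value is zero, which holds for calorific values.
    mape = 100.0 * sum(abs((t - p) / t) for t, p in pairs) / n
    rmse = math.sqrt(sum((t - p) ** 2 for t, p in pairs) / n)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in pairs)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return mape, rmse, r2


# Toy net calorific values in MJ/kg (illustrative only).
mape, rmse, r2 = regression_metrics([20.0, 25.0, 30.0], [21.0, 25.0, 29.0])
```

RMSE carries the unit of the target (MJ/kg here), while MAPE and R² are unitless, which is why the abstract reports all three.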

20 pages, 3456 KB

Open Access Article

TWISS: A Hybrid Multi-Criteria and Wrapper-Based Feature Selection Method for EMG Pattern Recognition in Prosthetic Applications

by Aura Polo, Nelson Cárdenas-Bolaño, Lácides Antonio Ripoll Solano, Lely A. Luengas-Contreras and Carlos Robles-Algarín

Algorithms 2025, 18(10), 633; https://doi.org/10.3390/a18100633 - 8 Oct 2025

Viewed by 552

Abstract

This paper proposes TWISS (TOPSIS + Wrapper Incremental Subset Selection), a novel hybrid feature selection framework designed for electromyographic (EMG) pattern recognition in upper-limb prosthetic control. TWISS integrates the multi-criteria decision-making method TOPSIS with a forward wrapper search strategy, enabling subject-specific feature optimization based on a ranking that combines filter metrics, including Chi-squared, ANOVA, and Mutual Information. Unlike conventional static feature sets, such as the Hudgins configuration (48 features: four per channel, 12 channels) or All Features (192 features: 16 per channel, 12 channels), TWISS dynamically adapts feature subsets to each subject, addressing inter-subject variability and classification robustness challenges in EMG systems. The proposed algorithm was evaluated on the publicly available Ninapro DB7 dataset, comprising both intact and transradial amputee participants, and implemented in an open-source, fully reproducible environment. Two Google Colab tools were developed to support diverse workflows: one for end-to-end feature extraction and selection, and another for selection on precomputed feature sets. Experimental results demonstrated that TWISS achieved a median F1-macro score of 0.6614 with Logistic Regression, outperforming the All Features set (0.6536) and significantly surpassing the Hudgins set (0.5626) while reducing feature dimensionality. TWISS offers a scalable and computationally efficient solution for feature selection in biomedical signal processing and beyond, promoting the development of personalized, low-cost prosthetic control systems and other resource-constrained applications. Full article
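
The TOPSIS step that fuses several filter scores into one ranking can be sketched as follows. This is a generic TOPSIS under stated assumptions, not the TWISS implementation: equal weights, all criteria treated as benefit criteria (higher is better), and a hypothetical function name and score matrix.

```python
import math


def topsis_closeness(scores, weights):
    """Closeness-to-ideal for each row (feature) of a criteria score matrix."""
    n_criteria = len(weights)
    # Vector-normalize each criterion column; guard against an all-zero column.
    norms = [math.sqrt(sum(row[j] ** 2 for row in scores)) or 1.0
             for j in range(n_criteria)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_criteria)]
                for row in scores]
    best = [max(col) for col in zip(*weighted)]   # ideal solution
    worst = [min(col) for col in zip(*weighted)]  # anti-ideal solution
    closeness = []
    for row in weighted:
        d_best = math.dist(row, best)
        d_worst = math.dist(row, worst)
        total = d_best + d_worst
        closeness.append(d_worst / total if total else 0.0)
    return closeness


# Rows are features; columns could be Chi-squared and ANOVA scores (toy values).
ranking_scores = topsis_closeness([[1.0, 1.0], [0.5, 0.5], [0.0, 0.0]], [0.5, 0.5])
```

A forward wrapper search could then add features in descending order of closeness and keep each one only if the classifier's validation score improves.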

22 pages, 2526 KB

Open Access Article

An Explainable Deep Learning Framework with Adaptive Feature Selection for Smart Lemon Disease Classification in Agriculture

by Naeem Ullah, Michelina Ruocco, Antonio Della Cioppa, Ivanoe De Falco and Giovanna Sannino

Electronics 2025, 14(19), 3928; https://doi.org/10.3390/electronics14193928 - 2 Oct 2025

Cited by 1 | Viewed by 1073

Abstract

Early and accurate detection of lemon disease is necessary for effective citrus crop management. Traditional approaches often lack fine-grained diagnostic capability, necessitating more powerful solutions. This article introduces adaptive PSO-LemonNetX, a framework integrating a new deep learning model, adaptive Particle Swarm Optimization (PSO)-based feature selection, and explainable AI (XAI) using LIME. The approach improves classification accuracy while also enhancing the explainability of the model. The end-to-end model obtained 97.01% testing and 98.55% validation accuracy. Performance improved further when adaptive PSO was combined with conventional classifiers: 100% validation accuracy using Naive Bayes, and 98.8% testing accuracy using Naive Bayes and an SVM. The proposed PSO-based feature selection performed better than the ReliefF, Kruskal–Wallis, and Chi-squared approaches. Due to its lightweight design and good performance, this approach can be adapted for edge devices in IoT-enabled smart farms, contributing to sustainable and automated disease detection systems. These results show the potential of integrating deep learning, PSO, grid search, and XAI into smart agriculture workflows for enhancing agricultural disease detection and decision-making. Full article

(This article belongs to the Special Issue Image Processing and Pattern Recognition)

14 pages, 879 KB

Open Access Article

Predicting Factors Associated with Extended Hospital Stay After Postoperative ICU Admission in Hip Fracture Patients Using Statistical and Machine Learning Methods: A Retrospective Single-Center Study

by Volkan Alparslan, Sibel Balcı, Ayetullah Gök, Can Aksu, Burak İnner, Sevim Cesur, Hadi Ufuk Yörükoğlu, Berkay Balcı, Pınar Kartal Köse, Veysel Emre Çelik, Serdar Demiröz and Alparslan Kuş

Healthcare 2025, 13(19), 2507; https://doi.org/10.3390/healthcare13192507 - 2 Oct 2025

Viewed by 856

Abstract

Background: Hip fractures are common in the elderly and often require ICU admission post-surgery due to high ASA scores and comorbidities. Length of hospital stay after ICU is a crucial indicator affecting patient recovery, complication rates, and healthcare costs. This study aimed to develop and validate a machine learning-based model to predict the factors associated with extended hospital stay (>7 days from surgery to discharge) in hip fracture patients requiring postoperative ICU care. The findings could help clinicians optimize ICU bed utilization and improve patient management strategies. Methods: In this retrospective single-centre cohort study conducted in a tertiary ICU in Turkey (2017–2024), 366 ICU-admitted hip fracture patients were analysed. Conventional statistical analyses were performed using SPSS 29, including Mann–Whitney U and chi-squared tests. To identify independent predictors associated with extended hospital stay, Least Absolute Shrinkage and Selection Operator (LASSO) regression was applied for variable selection, followed by multivariate binary logistic regression analysis. In addition, machine learning models (binary logistic regression, random forest (RF), extreme gradient boosting (XGBoost) and decision tree (DT)) were trained to predict the likelihood of extended hospital stay, defined as the total number of days from the date of surgery until hospital discharge, including both ICU and subsequent ward stay. Model performance was evaluated using AUROC, F1 score, accuracy, precision, recall, and Brier score. SHAP (SHapley Additive exPlanations) values were used to interpret feature contributions in the XGBoost model. Results: The XGBoost model showed the best performance, except for precision. The XGBoost model gave an AUROC of 0.80, precision of 0.67, recall of 0.92, F1 score of 0.78, accuracy of 0.71 and Brier score of 0.18. 
According to SHAP analysis, time from fracture to surgery, hypoalbuminaemia and ASA score were the variables that most affected length of hospital stay. Conclusions: The developed machine learning model successfully classified hip fracture patients into short and extended hospital stay groups following postoperative intensive care. This classification model has the potential to aid in patient flow management, resource allocation, and clinical decision support. External validation will further strengthen its applicability across different settings. Full article
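
The reported XGBoost metrics are internally consistent: the F1 score is the harmonic mean of precision and recall, and 0.67 and 0.92 give roughly 0.78. The sketch below reproduces that arithmetic and adds the standard Brier score definition. Function names are hypothetical, and the probabilities in the Brier example are invented rather than taken from the study.

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def brier_score(probabilities, outcomes):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    pairs = list(zip(probabilities, outcomes))
    return sum((p - y) ** 2 for p, y in pairs) / len(pairs)


# Precision and recall as reported in the abstract.
f1 = f1_from_precision_recall(0.67, 0.92)

# Invented probabilities for two patients: one extended stay, one short stay.
toy_brier = brier_score([0.9, 0.1], [1, 0])
```

A lower Brier score means better-calibrated probabilities; a model that always predicts 0.5 scores 0.25, which puts the reported 0.18 in context.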

(This article belongs to the Topic Application of Biostatistics in Medical Sciences and Global Health)

31 pages, 956 KB

Open Access Article

Environmental Awareness and Responsibility: A Machine Learning Analysis of Polish University Students

by Dorota Murzyn, Teresa Mroczek, Marta Czyżewska and Karolina Jezierska

Sustainability 2025, 17(19), 8577; https://doi.org/10.3390/su17198577 - 24 Sep 2025

Cited by 2 | Viewed by 1299

Abstract

This study explores the concept of environmental responsibility and assesses the attitudes and perceptions of young adults towards environmental challenges. Applying a hybrid approach based on feature selection, machine learning methods (classification and regression trees (CART) and recursive feature elimination (RFE)) and statistical methods (chi-squared tests), we analyzed survey data from 500 students across three universities. The results reveal that 82% of students rate their climate knowledge as moderate or good, while 92% perceive climate change as a serious threat. Women are more likely than men to report engagement in pro-environmental initiatives. Students’ environmental orientation weakens in the middle years of study but re-emerges in the final year, possibly reflecting greater maturity and a stronger sense of responsibility before graduation. The willingness to establish sustainable enterprises does not always correspond to a high level of knowledge or daily environmental practices. While undergraduates report high levels of climate awareness, they often fail to translate this into concrete actions, indicating a gap between knowledge, motivation, and practice. The insights from the research can inform environmental education strategies, institutional practices, and youth engagement programs within higher education. Full article

17 pages, 1337 KB

Open Access Article

Research on Accident Type Prediction for New Energy Vehicles Based on the AS-Naive Bayes Algorithm

by Shubing Huang, Bingshan Hou, Xiaoxuan Yin, Chenchen Kong and Chongming Wang

World Electr. Veh. J. 2025, 16(9), 523; https://doi.org/10.3390/wevj16090523 - 16 Sep 2025

Viewed by 810

Abstract

Developing new energy vehicles (NEVs) is a key strategy for achieving low-carbon and sustainable transportation. However, as the number of NEVs increases, traffic accidents involving these vehicles have risen sharply. To explore the characteristics of NEV accident types and assess the occurrence of different accident types, this study proposes an accident type analysis and prediction method based on a novel Naive Bayes algorithm that integrates additive smoothing and the Synthetic Minority Over-sampling Technique (SMOTE), termed AS-Naive Bayes. First, typical accident data (such as scraping, collisions, run-overs, rollovers, and battery fires/explosions) are extracted from the traffic management platform. A statistical analysis is then conducted to assess the relationships between accident types and factors including road conditions, time, vehicle status, and driver behavior. Next, to reduce the influence of irrelevant factors, Chi-square testing and Mutual Information are used to select features strongly associated with accident types. The AS-Naive Bayes algorithm is then applied to address the limited sample size and the imbalanced distribution of accident types. Finally, five-fold cross-validation results show that the proposed method achieves a prediction accuracy of 84.8%, outperforming Support Vector Machine (SVM, 74.1%), Long Short-Term Memory (LSTM, 79.8%), and standard Naive Bayes models, demonstrating its effectiveness in accurately identifying NEV accident types. Full article
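
Additive (Laplace) smoothing, one of the two ingredients named above, keeps a categorical Naive Bayes model from assigning zero probability to a feature value that never co-occurred with a class in training. The sketch shows only this step and omits SMOTE. The function name, the smoothing constant, and the road-condition values are hypothetical.

```python
from collections import Counter


def smoothed_likelihoods(values, vocabulary, alpha=1.0):
    """Estimate P(value | class) with additive smoothing.

    `values` are the observed feature values for one class; `vocabulary`
    lists every value the feature can take. alpha=1.0 is an assumed default.
    """
    counts = Counter(values)
    denom = len(values) + alpha * len(vocabulary)
    return {v: (counts[v] + alpha) / denom for v in vocabulary}


# Road condition observed in three rollover accidents (invented data);
# "icy" was never observed for this class but still gets nonzero probability.
probs = smoothed_likelihoods(["wet", "wet", "dry"], ["wet", "dry", "icy"])
```

Without smoothing, a single unseen value would zero out the whole class posterior, which matters most for rare accident types with few samples.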

(This article belongs to the Section Vehicle and Transportation Systems)

13 pages, 237 KB

Open Access Article

A Detailed Study of Infection Following Custom-Made Porous Hydroxyapatite Cranioplasty: Risk Factors and How to Possibly Avoid Device Explantation

by Francesca Carolina Mannella, Francesca Faedo, Johan Pallud, Salvatore Chibbaro, Marta Fumagalli, Giuseppe Danilo Norata, Ismail Zaed and Franco Servadei

J. Clin. Med. 2025, 14(18), 6443; https://doi.org/10.3390/jcm14186443 - 12 Sep 2025

Cited by 1 | Viewed by 1053

Abstract

Background/Objectives: Postoperative infection is a significant complication following cranioplasty procedures. This study aimed to assess infection risk factors and clinical outcomes in patients undergoing cranioplasty with custom-made porous hydroxyapatite (PHA) implants, with a particular focus on treatment strategies used to manage infections and avoid implant explantation. Methods: This retrospective multicenter analysis included 984 patients who underwent PHA cranioplasty as part of a post-market clinical follow-up. Clinical data included demographics, surgical characteristics, infection features, microbiological results, infection management strategies, and outcomes. Associations with infection risk and implant explantation were assessed using chi-square tests. Results: Seventy-six patients (7.7%) developed postoperative infections. Infection risk was significantly associated with second-line procedures (p = 0.011) and implant location (p = 0.037). Most infections were superficial (92.1%) and early-onset (≤2 months after surgery, 61.9%), with Staphylococcus spp. as the predominant pathogens. Explantation occurred in 77.6% of infected cases. The infection management strategy (initial conservative treatment with antibiotics alone, n = 18, of which 11.1% were explanted, or surgical reoperation, n = 58, of which 93.8% were explanted), surgical cleaning, and local (in situ) antibiotic use alone were each significantly associated with explantation outcomes (all p < 0.001). Among the 18 patients treated with systemic antibiotics alone, 88.9% retained their implants. Notably, all successful cases had received broad-spectrum antibiotics for at least 4 weeks. Local antibiotic therapy was administered in 13 patients; no explantations occurred among those who also received prolonged systemic treatment. Pathogen type was not significantly associated with the risk of explantation. Conclusions: Prolonged systemic antibiotic therapy, especially when combined with local treatment, may allow implant retention in selected infections, supporting individualized, conservative management strategies.

(This article belongs to the Section Clinical Neurology)

24 pages, 4431 KB

Open AccessArticle

Fault Classification in Power Transformers Using Dissolved Gas Analysis and Optimized Machine Learning Algorithms

by Vuyani M. N. Dladla and Bonginkosi A. Thango

Machines 2025, 13(8), 742; https://doi.org/10.3390/machines13080742 - 20 Aug 2025

Viewed by 1050

Abstract

Power transformers are critical assets in electrical power systems, yet their fault diagnosis often relies on conventional dissolved gas analysis (DGA) methods such as the Duval Pentagon and Triangle, Key Gas, and Rogers Ratio methods. Even though these methods are commonly used, they present limitations in classification accuracy, concurrent fault identification, and manual sample handling. In this study, a framework of optimized machine learning algorithms that integrates Chi-squared statistical feature selection with Random Search hyperparameter optimization algorithms was developed to enhance transformer fault classification accuracy using DGA data, thereby addressing the limitations of conventional methods and improving diagnostic precision. Utilizing the R2024b MATLAB Classification Learner App, five optimized machine learning algorithms were trained and tested using 282 transformer oil samples with varying DGA gas concentrations obtained from industrial transformers, the IEC TC10 database, and the literature. The optimized and assessed models are Linear Discriminant, Naïve Bayes, Decision Trees, Support Vector Machine, Neural Networks, k-Nearest Neighbor, and the Ensemble Algorithm. From the proposed models, the best performing algorithm, Optimized k-Nearest Neighbor, achieved an overall performance accuracy of 92.478%, followed by the Optimized Neural Network at 89.823%. To assess their performance against the conventional methods, the same dataset used for the optimized machine learning algorithms was used to evaluate the performance of the Duval Triangle and Duval Pentagon methods using VAISALA DGA software version 1.1.0; the proposed models outperformed the conventional methods, which could only achieve a classification accuracy of 35.757% and 30.818%, respectively. This study concludes that the application of the proposed optimized machine learning algorithms can enhance the classification accuracy of DGA-based faults in power transformers, supporting more reliable diagnostics and proactive maintenance strategies.

(This article belongs to the Section Electrical Machines and Drives)

27 pages, 582 KB

Open AccessArticle

An Empirical Evaluation of Ensemble Models for Python Code Smell Detection

by Rajwant Singh Rao, Seema Dewangan and Alok Mishra

Appl. Sci. 2025, 15(13), 7472; https://doi.org/10.3390/app15137472 - 3 Jul 2025

Cited by 2 | Viewed by 1750

Abstract

Code smells, which represent poor design choices or suboptimal code implementations, reduce software quality and hinder code maintenance. Detecting code smells is, therefore, essential during software development. This study introduces a Python-based code smell dataset targeting two smell types: Large Class and Long Method. Five ensemble learning methods (Bagging, Gradient Boost, Max Voting, AdaBoost, and XGBoost) were employed to detect code smells within these datasets. The ten most significant features were selected using the Chi-square feature selection technique. To address class imbalance, the SMOTE algorithm was applied. Experimental results yielded a best accuracy score of 0.96 and an MCC of 0.85 for the Large Class dataset using the Max Voting model. For the Long Method dataset, a best accuracy score of 0.98 and an MCC of 0.94 were achieved using the Gradient Boost model in conjunction with Chi-square feature selection. These results highlight the effectiveness of the proposed methodology and its potential to significantly enhance code smell detection in Python.
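As a rough illustration of what Chi-square feature selection involves, the sketch below scores each categorical feature against the class label with the Pearson chi-square statistic and keeps the top k. It is not the authors' implementation: the function names, the feature names, and the toy data are all hypothetical, and the abstract does not say how features were encoded or which library was used.

```python
from collections import Counter


def chi_square_score(feature, labels):
    """Pearson chi-square statistic for one categorical feature vs. class labels."""
    n = len(labels)
    observed = Counter(zip(feature, labels))  # contingency-table cell counts
    feature_totals = Counter(feature)         # row totals
    label_totals = Counter(labels)            # column totals
    score = 0.0
    for f_value, f_count in feature_totals.items():
        for c_value, c_count in label_totals.items():
            expected = f_count * c_count / n  # count expected under independence
            score += (observed[(f_value, c_value)] - expected) ** 2 / expected
    return score


def select_top_k(columns, labels, k):
    """Return the names of the k features with the highest chi-square score."""
    scores = {name: chi_square_score(values, labels) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy data (hypothetical): 1 = smelly class, 0 = clean class.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
columns = {
    "many_methods": [1, 1, 1, 1, 0, 0, 0, 0],   # perfectly aligned with the label
    "many_fields": [1, 1, 1, 0, 0, 0, 0, 1],    # partially aligned
    "has_docstring": [1, 0, 1, 0, 1, 0, 1, 0],  # independent of the label
}

print(select_top_k(columns, labels, k=2))
```

A feature that is independent of the label scores zero, so it falls to the bottom of the ranking. Continuous metrics would need to be binned before a contingency-table score like this one applies.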

(This article belongs to the Special Issue Intelligent Software Engineering: Innovations, Challenges, and Applications)
