MDPI - Publisher of Open Access Journals

42 pages, 6154 KB

Open Access Feature Paper Article

A Novel Hybrid Opcode Feature Selection Framework for Efficient and Effective IoT Malware Detection

by Bakhan Tofiq Ahmed, Noor Ghazi M. Jameel and Bakhtiar Ibrahim Saeed

IoT 2026, 7(1), 24; https://doi.org/10.3390/iot7010024 - 2 Mar 2026

Viewed by 542

Malware’s proliferation in the Internet of Things (IoT) ecosystem requires precise, efficient detection systems capable of operating on IoT devices. Existing static analysis approaches often fail due to computational inefficiency stemming from the high feature dimensionality inherent in raw opcode features. This research addresses this limitation by proposing a novel machine-learning (ML)-driven Intelligent Hybrid Feature Selection (IHFS) framework with two distinct architectures. IHFS1 combines a filter method (variance threshold) with an embedded method (LGBM feature importance), whereas IHFS2 integrates variance thresholding with a wrapper method (Recursive Feature Elimination with Cross-Validation using LGBM). The framework is designed to select a stable, minimal feature subset from the initial 1183-dimensional opcode frequency vector extracted from ARM binaries. Applied to a multi-family IoT malware dataset, the two architectures yielded distinct, compact feature subsets: IHFS1 achieved a 95.77% reduction (to 50 features), while IHFS2 attained a 98.06% reduction (to 23 features). Evaluation across eight ML models showed that the Random Forest (with the IHFS1 subset) and Decision Tree (with the IHFS2 subset) classifiers performed best, achieving classification metrics that outperform current state-of-the-art solutions. The Decision Tree model reached an accuracy of 99.87%, a precision of 99.82%, a recall of 99.88%, and an F1-score of 99.85%, with an average inference time of 0.058 ms per sample. Experimental results obtained in a native ARM64 environment validate the deployment feasibility of the proposed system on resource-constrained IoT devices such as the Raspberry Pi. The proposed system achieves a high-throughput, low-overhead security posture while maintaining host operational stability, processing a single ELF binary in just 3.431 ms.
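
The filter stage and the reported reduction figures can be illustrated with a short sketch. This is not the authors' code: the opcode names and counts are hypothetical, only the variance-threshold step is shown (the LGBM-based second stage is omitted), and `variance_threshold` and `reduction_rate` are illustrative helper names.

```python
from statistics import pvariance


def variance_threshold(features: dict[str, list[float]], threshold: float = 0.0) -> list[str]:
    """Return the names of features whose population variance exceeds `threshold`."""
    return [name for name, values in features.items() if pvariance(values) > threshold]


def reduction_rate(n_original: int, n_selected: int) -> float:
    """Percentage of features removed by a selection step."""
    if n_original <= 0 or not 0 <= n_selected <= n_original:
        raise ValueError("need n_original > 0 and 0 <= n_selected <= n_original")
    return 100.0 * (n_original - n_selected) / n_original


# Hypothetical opcode counts for three binaries (illustrative values only).
toy_counts = {"mov": [3, 5, 4], "nop": [0, 0, 0], "ldr": [1, 1, 2]}
kept = variance_threshold(toy_counts)  # the constant "nop" column is dropped

# Reduction figures from the abstract: 1183 -> 50 features and 1183 -> 23 features.
ihfs1_reduction = reduction_rate(1183, 50)
ihfs2_reduction = reduction_rate(1183, 23)
```

Rounded to two decimals, the two rates agree with the 95.77% and 98.06% quoted in the abstract.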

(This article belongs to the Special Issue Cybersecurity in the Age of the Internet of Things)

24 pages, 2125 KB

Open Access Article

MIC-SSO: A Two-Stage Hybrid Feature Selection Approach for Tabular Data

by Wei-Chang Yeh, Yunzhi Jiang, Hsin-Jung Hsu and Chia-Ling Huang

Electronics 2026, 15(4), 856; https://doi.org/10.3390/electronics15040856 - 18 Feb 2026

Viewed by 361

Abstract

High-dimensional structured datasets are common in fields such as semiconductor manufacturing, healthcare, and finance, where redundant and irrelevant features often increase computational cost and reduce predictive accuracy. Feature selection mitigates these issues by identifying a compact, informative subset of features, enhancing model efficiency, performance, and interpretability. This study proposes Maximal Information Coefficient–Simplified Swarm Optimization (MIC-SSO), a two-stage hybrid feature selection method that combines the MIC as a filter with SSO as a wrapper. In Stage 1, MIC ranks feature relevance and removes low-contribution features; in Stage 2, SSO searches for an optimal subset from the reduced feature space using a fitness function that integrates the Matthews Correlation Coefficient (MCC) and feature reduction rate to balance accuracy and compactness. Experiments on five public datasets compare MIC-SSO with multiple hybrid, heuristic, and literature-reported methods, with results showing superior predictive accuracy and feature compression. These gains make the method a practical tool for data analysis in fields such as healthcare, finance, and semiconductor manufacturing. Statistical tests further confirm significant improvements over competing approaches, demonstrating that the method combines the efficiency of filters with the precision of wrappers for high-dimensional tabular data analysis.
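
A fitness function of the kind described for Stage 2 might look like the sketch below. The abstract does not say how MCC and the feature reduction rate are combined, so the weighted sum and the weight `w` are assumptions, and the function names are illustrative.

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews Correlation Coefficient from binary confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is taken as 0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def fitness(mcc_value: float, n_selected: int, n_total: int, w: float = 0.9) -> float:
    """Assumed form: weighted sum of MCC and the fraction of features removed."""
    reduction = 1.0 - n_selected / n_total
    return w * mcc_value + (1.0 - w) * reduction


# A perfect classifier that keeps 10 of 100 features scores 0.9 * 1.0 + 0.1 * 0.9.
perfect = fitness(mcc(tp=50, tn=50, fp=0, fn=0), n_selected=10, n_total=100)
```

With this form, two subsets of equal MCC are ranked by compactness, and `w` sets how much accuracy may be traded for a smaller subset.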

(This article belongs to the Special Issue Feature Papers in Networks: 2025–2026 Edition)

21 pages, 1305 KB

Open Access Article

Cross-Learner Spectral Subset Optimisation: PLS–Ensemble Feature Selection with Weighted Borda Count for Grapevine Cultivar Discrimination

by Kyle Loggenberg, Albert Strever and Zahn Münch

Geomatics 2026, 6(1), 12; https://doi.org/10.3390/geomatics6010012 - 28 Jan 2026

Viewed by 421

Abstract

The mapping of vineyard cultivars presents a substantial challenge in digital agriculture due to the crop’s high intra-class heterogeneity and low inter-class variability. High-dimensional spectral datasets, such as hyperspectral or spectrometry data, can overcome these difficulties. However, research has yet to fully address the need for optimal spectral feature subsets tailored for grapevine cultivar discrimination, while few studies have systematically examined waveband subsets that transfer effectively across different learning algorithms. This study sets out to address these gaps by introducing a Partial Least Squares (PLS)-based ensemble feature selection framework with Weighted Borda Count aggregation for cultivar discrimination. Using in-field spectrometry data, collected for six cultivars, and 18 PLS-based feature selection methods spanning filter, wrapper, and hybrid approaches, the PLS–ensemble identified 100 wavebands most relevant for cultivar discrimination, reducing dimensionality by ~95%. The efficacy and transferability of this subset were evaluated using five classification algorithms: Oblique Random Forest (oRF), Multinomial Logistic Regression (Multinom), Support Vector Machine (SVM), Multi-Layer Perceptron (MLP), and a 1D Convolutional Neural Network (CNN). For oRF, Multinom, SVM, and MLP, the PLS–ensemble subset improved accuracy by 0.3–12% compared with using all wavebands. The subset was not optimal for the 1D-CNN, where accuracy decreased by up to 5.7%. Additionally, this study investigated waveband binning to transform narrow hyperspectral bands into broadband spectral features. Using feature multicollinearity and wavelength position, the 100 selected wavebands were condensed into 10 broadband features, which improved accuracy over both the full dataset and the original subset, delivering gains of 4.5–19.1%. The SVM model with this 10-feature subset outperformed all other models (F1: 1.00; BACC: 0.98; MCC: 0.78; AUC: 0.95).
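
Weighted Borda Count aggregation can be sketched as follows. The method names, waveband labels, and weights are hypothetical, and the abstract does not state how the weights are derived; here each selector's ranking simply contributes points scaled by an assumed weight.

```python
from collections import defaultdict


def weighted_borda(rankings: dict[str, list[str]], weights: dict[str, float], top_k: int) -> list[str]:
    """Aggregate best-first rankings; a feature at position p of n earns weight * (n - p)."""
    scores: dict[str, float] = defaultdict(float)
    for method, ranked in rankings.items():
        n = len(ranked)
        for position, feature in enumerate(ranked):
            scores[feature] += weights[method] * (n - position)
    # Sort by descending score, breaking ties alphabetically for determinism.
    ordered = sorted(scores, key=lambda f: (-scores[f], f))
    return ordered[:top_k]


# Hypothetical rankings of three wavebands by three selectors (best first).
rankings = {
    "vip": ["b550", "b720", "b980"],
    "coef": ["b720", "b550", "b980"],
    "wrapper": ["b720", "b980", "b550"],
}
weights = {"vip": 1.0, "coef": 0.5, "wrapper": 0.5}  # assumed reliability weights
consensus = weighted_borda(rankings, weights, top_k=2)
```

In this toy case `b720` scores 5.0, `b550` 4.5, and `b980` 2.5, so the consensus keeps `b720` and `b550`.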

14 pages, 399 KB

Open Access Article

LAFS: A Fast, Differentiable Approach to Feature Selection Using Learnable Attention

by Hıncal Topçuoğlu, Atıf Evren, Elif Tuna and Erhan Ustaoğlu

Entropy 2026, 28(1), 20; https://doi.org/10.3390/e28010020 - 24 Dec 2025

Viewed by 773

Abstract

Feature selection is a critical preprocessing step for mitigating the curse of dimensionality in machine learning. Existing methods present a difficult trade-off: filter methods are fast but often suboptimal as they evaluate features in isolation, while wrapper methods are powerful but computationally prohibitive due to their iterative nature. In this paper, we propose LAFS (Learnable Attention for Feature Selection), a novel, end-to-end differentiable framework that achieves the performance of wrapper methods at the speed of simpler models. LAFS employs a neural attention mechanism to learn a context-aware importance score for all features simultaneously in a single forward pass. To encourage the selection of a sparse and non-redundant feature subset, we introduce a novel hybrid loss function that combines the standard classification objective with an information-theoretic entropic regularizer on the attention weights. We validate our approach on real-world high-dimensional benchmark datasets. Our experiments demonstrate that LAFS successfully identifies complex feature interactions and handles multicollinearity. In an overall comparison, LAFS achieves accuracy very close to that of state-of-the-art RFE-LGBM and embedded FSA methods. Our work establishes a new point on the accuracy-efficiency frontier, demonstrating that attention-based architectures provide a viable solution to the feature selection problem.
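
The idea of an entropic regularizer on attention weights can be sketched in a few lines. The abstract does not give the exact loss, so this assumes softmax-normalized attention scores, Shannon entropy as the penalty, and a hypothetical coefficient `lam`; the classification loss is passed in as a plain number.

```python
import math


def softmax(scores: list[float]) -> list[float]:
    """Numerically stable softmax over raw attention scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(weights: list[float]) -> float:
    """Shannon entropy in nats; zero-weight terms contribute nothing."""
    return -sum(w * math.log(w) for w in weights if w > 0)


def hybrid_loss(classification_loss: float, attention_scores: list[float], lam: float = 0.1) -> float:
    """Assumed form: classification loss plus lam times the attention entropy."""
    return classification_loss + lam * entropy(softmax(attention_scores))


# Uniform attention over four features has maximal entropy log(4);
# attention concentrated on one feature is penalized far less.
uniform = hybrid_loss(0.5, [0.0, 0.0, 0.0, 0.0])
peaked = hybrid_loss(0.5, [10.0, 0.0, 0.0, 0.0])
```

Minimizing such a loss pushes the attention distribution toward a few dominant features, which is one way a sparse subset could emerge from a single differentiable objective.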

(This article belongs to the Special Issue Information-Theoretic Methods in Data Analytics, 2nd Edition)

34 pages, 1960 KB

Open Access Article

Quantum-Inspired Hybrid Metaheuristic Feature Selection with SHAP for Optimized and Explainable Spam Detection

by Qusai Shambour, Mahran Al-Zyoud and Omar Almomani

Symmetry 2025, 17(10), 1716; https://doi.org/10.3390/sym17101716 - 13 Oct 2025

Cited by 7 | Viewed by 1398

Abstract

The rapid growth of digital communication has intensified spam-related threats, including phishing and malware, which employ advanced evasion tactics. Traditional filtering methods struggle to keep pace, driving the need for sophisticated machine learning (ML) solutions. The effectiveness of ML models hinges on selecting high-quality input features, especially in high-dimensional datasets where irrelevant or redundant attributes impair performance and computational efficiency. Guided by principles of symmetry to achieve an optimal balance between model accuracy, complexity, and interpretability, this study proposes an Enhanced Hybrid Quantum-Inspired Firefly and Artificial Bee Colony (EHQ-FABC) algorithm for feature selection in spam detection. EHQ-FABC leverages the Firefly Algorithm’s local exploitation and the Artificial Bee Colony’s global exploration, augmented with quantum-inspired principles to maintain search space diversity and a symmetrical balance between exploration and exploitation. It eliminates redundant attributes while preserving predictive power. For interpretability, SHapley Additive exPlanations (SHAP) is employed to ensure symmetry in explanation, meaning that features with equal contributions are assigned equal importance, providing a fair and consistent interpretation of the model’s decisions. Evaluated on the ISCX-URL2016 dataset, EHQ-FABC reduces features by over 76%, retaining only 17 of 72 features, while matching or outperforming filter, wrapper, embedded, and metaheuristic methods. Tested across ML classifiers such as CatBoost, XGBoost, Random Forest, Extra Trees, Decision Tree, K-Nearest Neighbors, Logistic Regression, and Multi-Layer Perceptron, EHQ-FABC achieves a peak accuracy of 99.97% with CatBoost and robust results across tree ensembles, neural, and linear models. SHAP analysis highlights features like domain_token_count and NumberOfDotsinURL as key for spam detection, offering actionable insights for practitioners. EHQ-FABC provides a reliable, transparent, and efficient symmetry-aware solution, advancing both accuracy and explainability in spam detection.

(This article belongs to the Section Computer)

20 pages, 3456 KB

Open Access Article

TWISS: A Hybrid Multi-Criteria and Wrapper-Based Feature Selection Method for EMG Pattern Recognition in Prosthetic Applications

by Aura Polo, Nelson Cárdenas-Bolaño, Lácides Antonio Ripoll Solano, Lely A. Luengas-Contreras and Carlos Robles-Algarín

Algorithms 2025, 18(10), 633; https://doi.org/10.3390/a18100633 - 8 Oct 2025

Cited by 1 | Viewed by 735

Abstract

This paper proposes TWISS (TOPSIS + Wrapper Incremental Subset Selection), a novel hybrid feature selection framework designed for electromyographic (EMG) pattern recognition in upper-limb prosthetic control. TWISS integrates the multi-criteria decision-making method TOPSIS with a forward wrapper search strategy, enabling subject-specific feature optimization based on a ranking that combines filter metrics, including Chi-squared, ANOVA, and Mutual Information. Unlike conventional static feature sets, such as the Hudgins configuration (48 features: four per channel, 12 channels) or All Features (192 features: 16 per channel, 12 channels), TWISS dynamically adapts feature subsets to each subject, addressing inter-subject variability and classification robustness challenges in EMG systems. The proposed algorithm was evaluated on the publicly available Ninapro DB7 dataset, comprising both intact and transradial amputee participants, and implemented in an open-source, fully reproducible environment. Two Google Colab tools were developed to support diverse workflows: one for end-to-end feature extraction and selection, and another for selection on precomputed feature sets. Experimental results demonstrated that TWISS achieved a median F1-macro score of 0.6614 with Logistic Regression, outperforming the All Features set (0.6536) and significantly surpassing the Hudgins set (0.5626) while reducing feature dimensionality. TWISS offers a scalable and computationally efficient solution for feature selection in biomedical signal processing and beyond, promoting the development of personalized, low-cost prosthetic control systems and other resource-constrained applications.
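
The two ingredients named in the abstract, a TOPSIS ranking over several filter metrics followed by a forward wrapper pass, can be sketched as below. This is a generic illustration under stated assumptions, not the TWISS implementation: all criteria are treated as benefit criteria with equal weights, the acceptance rule (keep a feature only if the score improves) is assumed, and the metric values, feature names, and `evaluate` callback are hypothetical.

```python
import math
from typing import Callable


def topsis_closeness(matrix: list[list[float]], weights: list[float]) -> list[float]:
    """Closeness to the ideal solution for each row (feature); columns are benefit criteria."""
    n_criteria = len(weights)
    # Vector-normalize each column, falling back to 1.0 for an all-zero column.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) or 1.0 for j in range(n_criteria)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_criteria)] for row in matrix]
    best = [max(column) for column in zip(*weighted)]
    worst = [min(column) for column in zip(*weighted)]
    closeness = []
    for row in weighted:
        d_best, d_worst = math.dist(row, best), math.dist(row, worst)
        total = d_best + d_worst
        closeness.append(d_worst / total if total else 0.0)
    return closeness


def incremental_wrapper(ranked: list[str], evaluate: Callable[[list[str]], float]) -> list[str]:
    """Walk the ranking best-first, keeping a feature only if it improves the score."""
    selected: list[str] = []
    best_score = float("-inf")
    for feature in ranked:
        score = evaluate(selected + [feature])
        if score > best_score:
            selected.append(feature)
            best_score = score
    return selected


# Hypothetical filter scores per feature: [chi-squared, ANOVA F, mutual information].
names = ["rms", "mav", "wl"]
scores = [[10.0, 8.0, 0.9], [1.0, 1.0, 0.1], [5.0, 4.0, 0.5]]
closeness = topsis_closeness(scores, weights=[1 / 3, 1 / 3, 1 / 3])
ranked = [name for _, name in sorted(zip(closeness, names), reverse=True)]
```

In practice `evaluate` would be something like a cross-validated F1 score computed per subject, which is how a ranking-plus-wrapper scheme could adapt the subset to each user.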

20 pages, 914 KB

Open Access Article

Cost-Efficient Hybrid Filter-Based Parameter Selection Scheme for Intrusion Detection System in IoT

by Gabriel Chukwunonso Amaizu, Akshita Maradapu Vera Venkata Sai, Madhuri Siddula and Dong-Seong Kim

Electronics 2025, 14(4), 726; https://doi.org/10.3390/electronics14040726 - 13 Feb 2025

Viewed by 1104

Abstract

The rapid growth of Internet of Things (IoT) devices has brought about significant advancements in automation, data collection, and connectivity across various domains. However, this increased interconnectedness also poses substantial security challenges, making IoT networks attractive targets for malicious actors. Intrusion detection systems (IDSs) play a vital role in protecting IoT environments from cyber threats, which necessitates the development of sophisticated and effective network IDS (NIDS) solutions. This paper proposes an IDS that addresses the curse of dimensionality by eliminating redundant and highly correlated features, followed by a wrapper-based feature ranking to determine their importance. Additionally, the IDS incorporates cutting-edge image processing techniques to reconstruct data into images, which are further enhanced through a filtering process. Finally, a meta classifier, consisting of three base models, is employed for efficient and accurate intrusion detection. Simulation results using industry-standard datasets demonstrate that the hybrid parameter selection approach significantly reduces computational costs while maintaining reliability. Furthermore, the combination of image transformation and ensemble learning techniques achieves higher detection accuracy, further enhancing the effectiveness of the proposed IDS.

(This article belongs to the Special Issue New Challenges in Cyber Security)

17 pages, 3022 KB

Open Access Article

An Optimized Hybrid Approach for Feature Selection Based on Chi-Square and Particle Swarm Optimization Algorithms

by Amani Abdo, Rasha Mostafa and Laila Abdel-Hamid

Data 2024, 9(2), 20; https://doi.org/10.3390/data9020020 - 25 Jan 2024

Cited by 25 | Viewed by 5451

Abstract

Feature selection is a significant issue in the machine learning process. Most datasets include features that are not needed for the problem being studied. These irrelevant features reduce both the efficiency and accuracy of the algorithm. Feature selection can be framed as an optimization problem, and swarm intelligence algorithms are promising techniques for solving it. This research paper presents a hybrid approach for tackling the problem of feature selection. A filter method (chi-square) and two wrapper swarm intelligence algorithms (grey wolf optimization (GWO) and particle swarm optimization (PSO)) are used in two different techniques to improve feature selection accuracy and system execution time. The performance of the two phases of the proposed approach is assessed using two distinct datasets. The results show that PSOGWO yields a maximum accuracy boost of 95.3%, while chi2-PSOGWO yields a maximum accuracy improvement of 95.961% for feature selection. The experimental results show that the proposed approach performs better than the compared approaches.

(This article belongs to the Section Information Systems and Data Management)

22 pages, 5551 KB

Open Access Article

A Multivariate Time Series Analysis of Electrical Load Forecasting Based on a Hybrid Feature Selection Approach and Explainable Deep Learning

by Fatma Yaprakdal and Merve Varol Arısoy

Appl. Sci. 2023, 13(23), 12946; https://doi.org/10.3390/app132312946 - 4 Dec 2023

Cited by 31 | Viewed by 6271

Abstract

In the smart grid paradigm, precise electrical load forecasting (ELF) offers significant advantages for enhancing grid reliability and informing energy planning decisions. Specifically, mid-term ELF is a key priority for power system planning and operation. Although statistical methods were primarily used because ELF is a time series problem, deep learning (DL)-based forecasting approaches are more commonly employed and successful in achieving precise predictions. However, these DL-based techniques, known as black box models, lack interpretability. When interpreting the DL model, employing explainable artificial intelligence (XAI) yields significant advantages by extracting meaningful information from the DL model outputs and the causal relationships among various factors. In addition, precise load forecasting necessitates employing feature engineering to identify pertinent input features and determine optimal time lags. This research study strives to accomplish mid-term ELF utilizing aggregated electrical load consumption data, while considering the aforementioned critical aspects. A hybrid framework for feature selection and extraction is proposed for electric load forecasting. The feature selection phase employs a combination of a filter method, Pearson correlation (PC), and two embedded methods, random forest regressor (RFR) and decision tree regressor (DTR), to determine the correlation and significance of each feature. In the feature extraction phase, we utilized a wrapper-based technique called recursive feature elimination cross-validation (RFECV) to eliminate redundant features. Multi-step-ahead time series forecasting is conducted utilizing three distinct long short-term memory (LSTM) models: basic LSTM, bi-directional LSTM (Bi-LSTM) and attention-based LSTM models to accurately predict electrical load consumption thirty days in advance. Across numerous experiments, a reduction in forecasting errors of nearly 50% has been attained. Additionally, the local interpretable model-agnostic explanations (LIME) methodology, which is an XAI technique, is utilized for explaining the mid-term ELF model. As far as the authors are aware, XAI has not yet been implemented in mid-term aggregated energy forecasting studies utilizing the ELF method. Quantitative and detailed evaluations have been conducted, with the experimental results indicating that this comprehensive approach is entirely successful in forecasting multivariate mid-term loads.

14 pages, 470 KB

Open Access Article

Genetic Algorithm for High-Dimensional Emotion Recognition from Speech Signals

by Liya Yue, Pei Hu, Shu-Chuan Chu and Jeng-Shyang Pan

Electronics 2023, 12(23), 4779; https://doi.org/10.3390/electronics12234779 - 25 Nov 2023

Cited by 3 | Viewed by 1934

Abstract

Feature selection plays a crucial role in establishing an effective speech emotion recognition system. To improve recognition accuracy, researchers often extract as many features as possible from speech signals. However, this may reduce efficiency. We propose a hybrid filter–wrapper feature selection method based on a genetic algorithm, termed HGA, specifically designed for high-dimensional speech emotion recognition. The algorithm first utilizes Fisher Score and information gain to comprehensively rank acoustic features, and then these features are assigned probabilities for inclusion in subsequent operations according to their ranking. HGA improves population diversity and local search ability by modifying the initial population generation method of the genetic algorithm (GA) and introducing adaptive crossover and a new mutation strategy. The proposed algorithm clearly reduces the number of selected features in four common English speech emotion datasets. Experiments with K-nearest neighbor and random forest classifiers confirm that it is superior to state-of-the-art algorithms in accuracy, precision, recall, and F1-Score.

(This article belongs to the Special Issue Evolutionary Computation Methods for Real-World Problem Solving)

17 pages, 438 KB

Open Access Article

A Machine Learning Method with Hybrid Feature Selection for Improved Credit Card Fraud Detection

by Ibomoiye Domor Mienye and Yanxia Sun

Appl. Sci. 2023, 13(12), 7254; https://doi.org/10.3390/app13127254 - 18 Jun 2023

Cited by 49 | Viewed by 8053

Abstract

With the rapid developments in electronic commerce and digital payment technologies, credit card transactions have increased significantly. Machine learning (ML) has been vital in analyzing customer data to detect and prevent fraud. However, the presence of redundant and irrelevant features in most real-world credit card data degrades the performance of ML classifiers. This study proposes a hybrid feature-selection technique consisting of filter and wrapper feature-selection steps to ensure that only the most relevant features are used for machine learning. The proposed method uses the information gain (IG) technique to rank the features, and the top-ranked features are fed to a genetic algorithm (GA) wrapper, which uses the extreme learning machine (ELM) as the learning algorithm. Meanwhile, the proposed GA wrapper is optimized for imbalanced classification using the geometric mean (G-mean) as the fitness function instead of the conventional accuracy metric. The proposed approach achieved a sensitivity and specificity of 0.997 and 0.994, respectively, outperforming other baseline techniques and methods in the recent literature.
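
Why G-mean suits imbalanced data better than accuracy can be seen in a small sketch. The confusion-matrix counts below are hypothetical; only the sensitivity and specificity of 0.997 and 0.994 come from the abstract, and the function names are illustrative.

```python
import math


def g_mean(sensitivity: float, specificity: float) -> float:
    """Geometric mean of sensitivity (recall on frauds) and specificity."""
    return math.sqrt(sensitivity * specificity)


def g_mean_from_counts(tp: int, fn: int, tn: int, fp: int) -> float:
    """G-mean computed from binary confusion-matrix counts."""
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return g_mean(sensitivity, specificity)


def accuracy(tp: int, fn: int, tn: int, fp: int) -> float:
    """Plain accuracy, for comparison."""
    return (tp + tn) / (tp + fn + tn + fp)


# Hypothetical imbalanced case: 10 frauds among 1000 transactions,
# and a degenerate classifier that labels everything as legitimate.
lazy_accuracy = accuracy(tp=0, fn=10, tn=990, fp=0)          # high despite missing every fraud
lazy_g_mean = g_mean_from_counts(tp=0, fn=10, tn=990, fp=0)  # collapses to zero

# G-mean implied by the sensitivity and specificity reported in the abstract.
reported = g_mean(0.997, 0.994)
```

The degenerate classifier reaches 99% accuracy yet scores a G-mean of zero, so a GA wrapper driven by G-mean cannot reward subsets that ignore the minority class.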

19 pages, 3317 KB

Open Access Article

Selecting Features That Influence Vehicle Collisions in the Internet of Vehicles Based on a Multi-Objective Hybrid Bi-Directional NSGA-III

by Mubarak S. Almutairi, Khalid Almutairi and Haruna Chiroma

Appl. Sci. 2023, 13(4), 2064; https://doi.org/10.3390/app13042064 - 5 Feb 2023

Cited by 3 | Viewed by 2204

Abstract

The smart platform for generating, collecting, managing and processing dynamic data from different sources in the Internet of Vehicles (IoV) paves the way for a large-scale dataset to be accumulated. The dataset can contain records running into hundreds of thousands and even millions of relevant, irrelevant and redundant features. Therefore, feature selection to select only the significant features for developing vehicle collision detection alarm systems for deployment in the IoV edge is critical. However, previous studies on vehicle collision detection in the IoV have not conducted rigorous feature selection. Limited studies have mainly applied the Pearson correlation coefficient (PCC) to select subset features influencing vehicle collision in the domain of IoV. However, PCC can cause relevant features to be discarded if the correlation of the non-linear association is too small, thereby providing incorrect feature ranking, which, in turn, increases the chances of developing a model that will give a poor outcome. To close this gap, this paper proposes a multi-objective, filter-based hybrid non-dominated sorting genetic algorithm III (NSGA-III) with a gain ratio and bi-directional wrapper for the selection of subset features influencing vehicle collision in the IoV. The proposed approach selected a minimal subset of the most significant features for developing a vehicle collision detection classifier with maximum accuracy for deployment in the IoV. A comparative study shows that the proposed approach performs better than the compared algorithms across hybrid-, wrapper- and filter-based feature selection methods within the NSGA family. Further, a comparative analysis with other evolutionary algorithms shows the superiority of the proposal. This study can help researchers in the future by avoiding the use of large-scale computing resources in acquiring data to develop vehicle collision alert systems in the IoV. This can be achieved since only the subset features discovered in this study are collected, as opposed to collecting large features, thus saving time and resources in the subsequent vehicle collision detection data collection in the IoV.
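
The multi-objective trade-off at the heart of NSGA-style selection rests on Pareto dominance. The sketch below shows only that building block, with two assumed objectives to minimize (number of features and classification error) and hypothetical candidate values; it does not reproduce NSGA-III's reference-point mechanism or the paper's gain-ratio and bi-directional wrapper components.

```python
Objectives = tuple[float, float]  # (number of features, classification error), both minimized


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if `a` is no worse than `b` on every objective and strictly better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def first_front(candidates: list[Objectives]) -> list[Objectives]:
    """Candidates not dominated by any other candidate, in input order."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates if other != c)
    ]


# Hypothetical feature subsets described by (size, error).
candidates = [(5, 0.10), (3, 0.12), (8, 0.10), (3, 0.20), (10, 0.05)]
front = first_front(candidates)
```

Here `(8, 0.10)` is dominated by `(5, 0.10)` and `(3, 0.20)` by `(3, 0.12)`, leaving three subsets that each represent a different balance between compactness and accuracy.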

(This article belongs to the Special Issue Evolutionary Algorithms and Large-Scale Real-World Applications)

21 pages, 8316 KB

Open Access Article

A Novel Acoustic Method for Cavitation Identification of Propeller

by Yang Li and Lilin Cui

J. Mar. Sci. Eng. 2022, 10(9), 1225; https://doi.org/10.3390/jmse10091225 - 1 Sep 2022

Cited by 7 | Viewed by 3415

Abstract

When a propeller is under a state of cavitation, it will experience negative effects, including strong noise, vibration, and even damage to the blades. Accordingly, the detection of propeller cavitation has attracted the attention of researchers. The propeller noise signal contains a wealth of cavitation information, making it a powerful means of identifying the cavitation state. Considering the nonlinear characteristics of propeller noise, a feature describing the complexity of nonlinear signals, called refined composite multiscale fluctuation-based dispersion entropy (RCMFDE), is adopted as the indicator of propeller cavitation, and a framework for identifying the propeller cavitation state is established in this paper. Firstly, the propeller noise signal is decomposed by the complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) method, and the intrinsic mode function (IMF) components with cavitation characteristics are extracted. Secondly, the RCMFDE of the IMF components is computed. Finally, a hybrid optimization support vector machine (SVM) is established to classify the features, in which a Relief-F filter is utilized to reduce the feature dimension, and a particle swarm optimization (PSO) wrapper is utilized to optimize the parameters of the SVM. The experimental results demonstrate that this approach identifies propeller cavitation states with encouraging accuracy, reaching an identification accuracy of 91.11%.

(This article belongs to the Special Issue Application of Sensing and Machine Learning to Underwater Acoustic)

31 pages, 1727 KB

Open Access Article

Feature Selection Using Artificial Gorilla Troop Optimization for Biomedical Data: A Case Analysis with COVID-19 Data

by Jayashree Piri, Puspanjali Mohapatra, Biswaranjan Acharya, Farhad Soleimanian Gharehchopogh, Vassilis C. Gerogiannis, Andreas Kanavos and Stella Manika

Mathematics 2022, 10(15), 2742; https://doi.org/10.3390/math10152742 - 3 Aug 2022

Cited by 78 | Viewed by 4242

Abstract

Feature selection (FS) is commonly thought of as a pre-processing strategy for determining the best subset of characteristics from a given collection of features. Here, a novel discrete artificial gorilla troop optimization (DAGTO) technique is introduced for the first time to handle FS tasks in the healthcare sector. Depending on the number and type of objective functions, four variants of the proposed method are implemented in this article, namely: (1) single-objective (SO-DAGTO), (2) bi-objective (wrapper) (MO-DAGTO1), (3) bi-objective (filter wrapper hybrid) (MO-DAGTO2), and (4) tri-objective (filter wrapper hybrid) (MO-DAGTO3) for identifying relevant features in diagnosing a particular disease. We provide a gorilla initialization strategy based on label mutual information (MI) with the aim of increasing population variety and accelerating convergence. To verify the performance of the presented methods, ten medical datasets of variable dimensions are considered. The best of the four suggested approaches (MO-DAGTO2) is also compared with four established multi-objective FS strategies and is shown statistically to be superior. Finally, a case study with COVID-19 samples is performed to extract the critical factors related to the disease and to demonstrate that the method is useful in real-world applications.

(This article belongs to the Special Issue Advanced Optimization Methods and Applications)
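The abstract does not say how label mutual information enters the initialization, so the sketch below shows only one plausible reading: each bit of a binary feature mask is switched on with a probability that grows with the MI between that feature and the class label. The function names, the `floor` and `ceil` probability bounds, and the linear scaling are hypothetical choices for illustration and are not taken from the paper.

```python
import math
import random
from collections import Counter


def mutual_information(feature, labels):
    """MI in nats between one discrete feature column and the class labels."""
    n = len(labels)
    joint = Counter(zip(feature, labels))
    px, py = Counter(feature), Counter(labels)
    return sum((k / n) * math.log(k * n / (px[x] * py[y]))
               for (x, y), k in joint.items())


def init_population(columns, labels, pop_size, rng, floor=0.1, ceil=0.9):
    """Draw binary feature masks whose bit probabilities follow scaled MI.

    `floor` and `ceil` (assumed values) keep every feature selectable and
    deselectable, which preserves some diversity in the population.
    """
    mi = [mutual_information(col, labels) for col in columns]
    top = max(mi)
    if top <= 1e-12:
        # No feature is informative: fall back to fair coin flips.
        probs = [0.5] * len(mi)
    else:
        probs = [floor + (ceil - floor) * max(v, 0.0) / top for v in mi]
    masks = [[1 if rng.random() < p else 0 for p in probs]
             for _ in range(pop_size)]
    return masks, probs
```

A call such as `init_population(columns, labels, pop_size=30, rng=random.Random(0))` returns the initial masks together with the per-feature probabilities, so the bias toward informative features can be inspected before any optimizer runs.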

22 pages, 5949 KB

Open Access Editor’s Choice Article

Choosing Feature Selection Methods for Spatial Modeling of Soil Fertility Properties at the Field Scale

by Caner Ferhatoglu and Bradley A. Miller

Agronomy 2022, 12(8), 1786; https://doi.org/10.3390/agronomy12081786 - 29 Jul 2022

Cited by 13 | Viewed by 3883

Abstract

With the growing availability of environmental covariates, feature selection (FS) is becoming an essential task for applying machine learning (ML) in digital soil mapping (DSM). In this study, the effectiveness of six types of FS methods from four categories (filter, wrapper, embedded, and hybrid) was compared. These FS algorithms chose relevant covariates from an exhaustive set of 1049 environmental covariates for predicting five soil fertility properties in ten fields, in combination with ten different ML algorithms. The resulting model performance was compared using three metrics: R² of 10-fold cross validation (CV), robustness ratio (RR; developed in this study), and independent validation with Lin’s concordance correlation coefficient (IV-CCC). FS improved CV, RR, and IV-CCC compared to the models built without FS for most fields and soil properties. Wrapper (BorutaShap) and embedded (Lasso-FS, Random forest-FS) methods usually led to the optimal models. The filter-based ANOVA-FS method mostly led to overfitted models, especially for fields with smaller sample quantities. Decision-tree-based models were usually part of the optimal combination of FS and ML. Considering RR helped identify optimal combinations of FS and ML that can improve the performance of DSM compared to models produced from full covariate stacks.

(This article belongs to the Special Issue Geostatistics and Machine Learning in the Mapping of Agricultural Soils: State-of-the-Art and Perspectives)
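Two of the metrics named in this abstract have standard closed forms, sketched below in plain Python: the coefficient of determination R² and Lin's concordance correlation coefficient (CCC). The function names are illustrative, the CCC uses population (divide-by-n) moments, and the study's robustness ratio is not reproduced because the abstract does not define it.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def lins_ccc(observed, predicted):
    """Lin's CCC: 2*cov / (var_o + var_p + (mean_o - mean_p)^2)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    var_o = sum((o - mean_o) ** 2 for o in observed) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    cov = sum((o - mean_o) * (p - mean_p)
              for o, p in zip(observed, predicted)) / n
    return 2.0 * cov / (var_o + var_p + (mean_o - mean_p) ** 2)
```

For predictions that track the observations perfectly but with a constant offset, the Pearson correlation stays at 1 while the CCC drops below 1, which is why CCC is a stricter check of agreement on independent validation data.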
