Algorithms for Feature Selection (2nd Edition)

A special issue of Algorithms (ISSN 1999-4893). This special issue belongs to the section "Algorithms for Multidisciplinary Applications".

Deadline for manuscript submissions: closed (30 September 2024) | Viewed by 19823

Special Issue Editor


E-Mail Website
Guest Editor
Department of Software, Faculty of Artificial Intelligence and Software, Gachon University, Seongnam 13120, Republic of Korea
Interests: algorithms; computational intelligence and its applications
Special Issues, Collections and Topics in MDPI journals

Special Issue Information

Dear Colleagues,

In recent years, feature selection has been acknowledged as one of the significant activity research fields due to the obvious emergence of datasets comprising large numbers of features. As a result, feature selection was considered an excellent technique for both improving the modeling of the underlying data generation process and lowering the cost of obtaining the features. Additionally, from a machine learning perspective, because feature selection may shrink the complexity of an issue, it can be utilized to preserve or even boost the effectiveness of algorithms while minimizing computing costs. Recently, the emergence of Big Data has created new hurdles for machine learning researchers, who must now handle vast amounts of data, both in terms of instances and characteristics, rendering the learning process more complicated and computationally intensive than ever. While engaging with a significant number of features, the efficiency of learning algorithms might degrade due to overfitting; as learned models become increasingly complicated, their interpretability decreases, and the performance and efficacy of the algorithms are affected. Unfortunately, some of the most widely used algorithms were designed when dataset sizes were considerably smaller, and therefore do not scale well in the wake of these developments. Thus, it is necessary to repurpose these effective methods to address Big Data concerns.

For this Special Issue, we seek papers concerning current advances in feature selection algorithms for high-dimensional settings, as well as review papers that will motivate ongoing efforts to grasp the challenges commonly faced in this field. High-quality articles that address both theoretical and practical challenges relating to feature selection algorithms are welcome.

Dr. Muhammad Adnan Khan
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Algorithms is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • algorithms and techniques for feature selection based on evolutionary search
  • ensemble methods for feature selection
  • feature selection for high dimensional data
  • feature selection for time series data
  • feature selection applications
  • feature selection for textual data
  • deep feature selection

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found here.

Related Special Issue

Published Papers (11 papers)

Order results
Result details
Select all
Export citation of selected articles as:

Editorial

Jump to: Research

6 pages, 182 KiB  
Editorial
Special Issue “Algorithms for Feature Selection (2nd Edition)”
by Muhammad Adnan Khan
Algorithms 2025, 18(1), 16; https://doi.org/10.3390/a18010016 - 3 Jan 2025
Viewed by 877
Abstract
This Special Issue focuses on advancing research on algorithms, with a particular emphasis on feature selection techniques [...] Full article
(This article belongs to the Special Issue Algorithms for Feature Selection (2nd Edition))

Research

Jump to: Editorial

21 pages, 623 KiB  
Article
Attribute Relevance Score: A Novel Measure for Identifying Attribute Importance
by Pablo Neirz, Hector Allende and Carolina Saavedra
Algorithms 2024, 17(11), 518; https://doi.org/10.3390/a17110518 - 9 Nov 2024
Viewed by 1626
Abstract
This study introduces a novel measure for evaluating attribute relevance, specifically designed to accurately identify attributes that are intrinsically related to a phenomenon, while being sensitive to the asymmetry of those relationships and noise conditions. Traditional variable selection techniques, such as filter and [...] Read more.
This study introduces a novel measure for evaluating attribute relevance, specifically designed to accurately identify attributes that are intrinsically related to a phenomenon, while being sensitive to the asymmetry of those relationships and noise conditions. Traditional variable selection techniques, such as filter and wrapper methods, often fall short in capturing these complexities. Our methodology, grounded in decision trees but extendable to other machine learning models, was rigorously evaluated across various data scenarios. The results demonstrate that our measure effectively distinguishes relevant from irrelevant attributes and highlights how relevance is influenced by noise, providing a more nuanced understanding compared to established methods such as Pearson, Spearman, Kendall, MIC, MAS, MEV, GMIC, and Phik. This research underscores the importance of phenomenon-centric explainability, reproducibility, and robust attribute relevance evaluation in the development of predictive models. By enhancing both the interpretability and contextual accuracy of models, our approach not only supports more informed decision making but also contributes to a deeper understanding of the underlying mechanisms in diverse application domains, such as biomedical research, financial modeling, astronomy, and others. Full article
(This article belongs to the Special Issue Algorithms for Feature Selection (2nd Edition))
Show Figures

Figure 1

21 pages, 8936 KiB  
Article
A Proposal for a New Python Library Implementing Stepwise Procedure
by Luiz Paulo Fávero, Helder Prado Santos, Patrícia Belfiore, Alexandre Duarte, Igor Pinheiro de Araújo Costa, Adilson Vilarinho Terra, Miguel Ângelo Lellis Moreira, Wilson Tarantin Junior and Marcos dos Santos
Algorithms 2024, 17(11), 502; https://doi.org/10.3390/a17110502 - 4 Nov 2024
Cited by 1 | Viewed by 980
Abstract
Carefully selecting variables in problems with large volumes of data are extremely important, as it reduces the complexity of the model, improves the interpretation of the results, and increases computational efficiency, ensuring more accurate and relevant analyses. This paper presents a comprehensive approach [...] Read more.
Carefully selecting variables in problems with large volumes of data are extremely important, as it reduces the complexity of the model, improves the interpretation of the results, and increases computational efficiency, ensuring more accurate and relevant analyses. This paper presents a comprehensive approach to selecting variables in multiple regression models using the stepwise procedure. As the main contribution of this study, we present the stepwise function implemented in Python to improve the effectiveness of statistical analyses, allowing the intuitive and efficient selection of statistically significant variables. The application of the function is exemplified in a real case study of real estate pricing, validating its effectiveness in improving the fit of regression models. In addition, we presented a methodological framework for treating joint problems in data analysis, such as heteroskedasticity, multicollinearity, and nonadherence of residues to normality. This framework offers a robust computational implementation to mitigate such issues. This study aims to advance the understanding and application of statistical methods in Python, providing valuable tools for researchers, students, and professionals from various areas. Full article
(This article belongs to the Special Issue Algorithms for Feature Selection (2nd Edition))
Show Figures

Figure 1

15 pages, 1321 KiB  
Article
ACT-FRCNN: Progress Towards Transformer-Based Object Detection
by Sukana Zulfqar, Zenab Elgamal, Muhammad Azam Zia, Abdul Razzaq, Sami Ullah and Hussain Dawood
Algorithms 2024, 17(11), 475; https://doi.org/10.3390/a17110475 - 23 Oct 2024
Cited by 1 | Viewed by 1354
Abstract
Maintaining a high input resolution is crucial for more complex tasks like detection or segmentation to ensure that models can adequately identify and reflect fine details in the output. This study aims to reduce the computation costs associated with high-resolution input by using [...] Read more.
Maintaining a high input resolution is crucial for more complex tasks like detection or segmentation to ensure that models can adequately identify and reflect fine details in the output. This study aims to reduce the computation costs associated with high-resolution input by using a variant of transformer, known as the Adaptive Clustering Transformer (ACT). The proposed model is named ACT-FRCNN. Which integrates ACT with a Faster Region-Based Convolution Neural Network (FRCNN) for a detection task head. In this paper, we proposed a method to improve the detection framework, resulting in better performance for out-of-domain images, improved object identification, and reduced dependence on non-maximum suppression. The ACT-FRCNN represents a significant step in the application of transformer models to challenging visual tasks like object detection, laying the foundation for future work using transformer models. The performance of ACT-FRCNN was evaluated on a variety of well-known datasets including BSDS500, NYUDv2, and COCO. The results indicate that ACT-FRCNN reduces over-detection errors and improves the detection of large objects. The findings from this research have practical implications for object detection and other computer vision tasks. Full article
(This article belongs to the Special Issue Algorithms for Feature Selection (2nd Edition))
Show Figures

Figure 1

18 pages, 5058 KiB  
Article
Measuring Student Engagement through Behavioral and Emotional Features Using Deep-Learning Models
by Nasir Mahmood, Sohail Masood Bhatti, Hussain Dawood, Manas Ranjan Pradhan and Haseeb Ahmad
Algorithms 2024, 17(10), 458; https://doi.org/10.3390/a17100458 - 16 Oct 2024
Cited by 1 | Viewed by 3039
Abstract
Students’ behavioral and emotional engagement in the classroom environment may reflect the students’ learning experience and subsequent educational outcomes. The existing research has overlooked the measurement of behavioral and emotional engagement in an offline classroom environment with more students, and it has not [...] Read more.
Students’ behavioral and emotional engagement in the classroom environment may reflect the students’ learning experience and subsequent educational outcomes. The existing research has overlooked the measurement of behavioral and emotional engagement in an offline classroom environment with more students, and it has not measured the student engagement level in an objective sense. This work aims to address the limitations of the existing research and presents an effective approach to measure students’ behavioral and emotional engagement and the student engagement level in an offline classroom environment during a lecture. More precisely, video data of 100 students during lectures in different offline classes were recorded and pre-processed to extract frames with individual students. For classification, convolutional-neural-network- and transfer-learning-based models including ResNet50, VGG16, and Inception V3 were trained, validated, and tested. First, behavioral engagement was computed using salient features, for which the self-trained CNN classifier outperformed with a 97%, 91%, and 83% training, validation, and testing accuracy, respectively. Subsequently, the emotional engagement of the behaviorally engaged students was computed, for which the ResNet50 model surpassed the others with a 95%, 90%, and 82% training, validation, and testing accuracy, respectively. Finally, a novel student engagement level metric is proposed that incorporates behavioral and emotional engagement. The proposed approach may provide support for improving students’ learning in an offline classroom environment and devising effective pedagogical policies. Full article
(This article belongs to the Special Issue Algorithms for Feature Selection (2nd Edition))
Show Figures

Figure 1

28 pages, 6310 KiB  
Article
Integrating Eye Movement, Finger Pressure, and Foot Pressure Information to Build an Intelligent Driving Fatigue Detection System
by Jong-Chen Chen and Yin-Zhen Chen
Algorithms 2024, 17(9), 402; https://doi.org/10.3390/a17090402 - 8 Sep 2024
Viewed by 1401
Abstract
Fatigued driving is a problem that every driver will face, and traffic accidents caused by drowsy driving often occur involuntarily. If there is a fatigue detection and warning system, it is generally believed that the occurrence of some incidents can be reduced. However, [...] Read more.
Fatigued driving is a problem that every driver will face, and traffic accidents caused by drowsy driving often occur involuntarily. If there is a fatigue detection and warning system, it is generally believed that the occurrence of some incidents can be reduced. However, everyone’s driving habits and methods may differ, so it is not easy to establish a suitable general detection system. If a customized intelligent fatigue detection system can be established, it may reduce unfortunate accidents. With its potential to mitigate unfortunate accidents, this study offers hope for a safer driving environment. Thus, on the one hand, this research hopes to integrate the information obtained from three different sensing devices (eye movement, finger pressure, and plantar pressure), which are chosen for their ability to provide comprehensive and reliable data on a driver’s physical and mental state. On the other hand, it uses an autonomous learning architecture to integrate these three data types to build a customized fatigued driving detection system. This study used a system that simulated a car driving environment and then invited subjects to conduct tests on fixed driving routes. First, we demonstrated that the system established in this study could be used to learn and classify different driving clips. Then, we showed that it was possible to judge whether the driver was fatigued through a series of driving behaviors, such as lane drifting, sudden braking, and irregular acceleration, rather than a single momentary behavior. Finally, we tested the hypothesized situation in which drivers were experiencing three cases of different distractions. The results show that the entire system can establish a personal driving system through autonomous learning behavior and further detect whether fatigued driving abnormalities occur. Full article
(This article belongs to the Special Issue Algorithms for Feature Selection (2nd Edition))
Show Figures

Figure 1

38 pages, 2613 KiB  
Article
Optimization of Gene Selection for Cancer Classification in High-Dimensional Data Using an Improved African Vultures Algorithm
by Mona G. Gafar, Amr A. Abohany, Ahmed E. Elkhouli and Amr A. Abd El-Mageed
Algorithms 2024, 17(8), 342; https://doi.org/10.3390/a17080342 - 6 Aug 2024
Cited by 2 | Viewed by 1326
Abstract
This study presents a novel method, termed RBAVO-DE (Relief Binary African Vultures Optimization based on Differential Evolution), aimed at addressing the Gene Selection (GS) challenge in high-dimensional RNA-Seq data, specifically the rnaseqv2 lluminaHiSeq rnaseqv2 un edu Level 3 RSEM genes normalized dataset, which [...] Read more.
This study presents a novel method, termed RBAVO-DE (Relief Binary African Vultures Optimization based on Differential Evolution), aimed at addressing the Gene Selection (GS) challenge in high-dimensional RNA-Seq data, specifically the rnaseqv2 lluminaHiSeq rnaseqv2 un edu Level 3 RSEM genes normalized dataset, which contains over 20,000 genes. RNA Sequencing (RNA-Seq) is a transformative approach that enables the comprehensive quantification and characterization of gene expressions, surpassing the capabilities of micro-array technologies by offering a more detailed view of RNA-Seq gene expression data. Quantitative gene expression analysis can be pivotal in identifying genes that differentiate normal from malignant tissues. However, managing these high-dimensional dense matrix data presents significant challenges. The RBAVO-DE algorithm is designed to meticulously select the most informative genes from a dataset comprising more than 20,000 genes and assess their relevance across twenty-two cancer datasets. To determine the effectiveness of the selected genes, this study employs the Support Vector Machine (SVM) and k-Nearest Neighbor (k-NN) classifiers. Compared to binary versions of widely recognized meta-heuristic algorithms, RBAVO-DE demonstrates superior performance. According to Wilcoxon’s rank-sum test, with a 5% significance level, RBAVO-DE achieves up to 100% classification accuracy and reduces the feature size by up to 98% in most of the twenty-two cancer datasets examined. This advancement underscores the potential of RBAVO-DE to enhance the precision of gene selection for cancer research, thereby facilitating more accurate and efficient identification of key genetic markers. Full article
(This article belongs to the Special Issue Algorithms for Feature Selection (2nd Edition))
Show Figures

Figure 1

16 pages, 5093 KiB  
Article
New Multi-View Feature Learning Method for Accurate Antifungal Peptide Detection
by Sayeda Muntaha Ferdous, Shafayat Bin Shabbir Mugdha and Iman Dehzangi
Algorithms 2024, 17(6), 247; https://doi.org/10.3390/a17060247 - 6 Jun 2024
Cited by 2 | Viewed by 1718
Abstract
Antimicrobial resistance, particularly the emergence of resistant strains in fungal pathogens, has become a pressing global health concern. Antifungal peptides (AFPs) have shown great potential as a promising alternative therapeutic strategy due to their inherent antimicrobial properties and potential application in combating fungal [...] Read more.
Antimicrobial resistance, particularly the emergence of resistant strains in fungal pathogens, has become a pressing global health concern. Antifungal peptides (AFPs) have shown great potential as a promising alternative therapeutic strategy due to their inherent antimicrobial properties and potential application in combating fungal infections. However, the identification of antifungal peptides using experimental approaches is time-consuming and costly. Hence, there is a demand to propose fast and accurate computational approaches to identifying AFPs. This paper introduces a novel multi-view feature learning (MVFL) model, called AFP-MVFL, for accurate AFP identification, utilizing multi-view feature learning. By integrating the sequential and physicochemical properties of amino acids and employing a multi-view approach, the AFP-MVFL model significantly enhances prediction accuracy. It achieves 97.9%, 98.4%, 0.98, and 0.96 in terms of accuracy, precision, F1 score, and Matthews correlation coefficient (MCC), respectively, outperforming previous studies found in the literature. Full article
(This article belongs to the Special Issue Algorithms for Feature Selection (2nd Edition))
Show Figures

Figure 1

24 pages, 2990 KiB  
Article
A Comparative Study of Machine Learning Methods and Text Features for Text Authorship Recognition in the Example of Azerbaijani Language Texts
by Rustam Azimov and Efthimios Providas
Algorithms 2024, 17(6), 242; https://doi.org/10.3390/a17060242 - 5 Jun 2024
Viewed by 1391
Abstract
This paper presents various machine learning methods with different text features that are explored and evaluated to determine the authorship of the texts in the example of the Azerbaijani language. We consider techniques like artificial neural network, convolutional neural network, random forest, and [...] Read more.
This paper presents various machine learning methods with different text features that are explored and evaluated to determine the authorship of the texts in the example of the Azerbaijani language. We consider techniques like artificial neural network, convolutional neural network, random forest, and support vector machine. These techniques are used with different text features like word length, sentence length, combined word length and sentence length, n-grams, and word frequencies. The models were trained and tested on the works of many famous Azerbaijani writers. The results of computer experiments obtained by utilizing a comparison of various techniques and text features were analyzed. The cases where the usage of text features allowed better results were determined. Full article
(This article belongs to the Special Issue Algorithms for Feature Selection (2nd Edition))
Show Figures

Figure 1

16 pages, 3410 KiB  
Article
Feature Extraction Based on Sparse Coding Approach for Hand Grasp Type Classification
by Jirayu Samkunta, Patinya Ketthong, Nghia Thi Mai, Md Abdus Samad Kamal, Iwanori Murakami and Kou Yamada
Algorithms 2024, 17(6), 240; https://doi.org/10.3390/a17060240 - 3 Jun 2024
Viewed by 1040
Abstract
The kinematics of the human hand exhibit complex and diverse characteristics unique to each individual. Various techniques such as vision-based, ultrasonic-based, and data-glove-based approaches have been employed to analyze human hand movements. However, a critical challenge remains in efficiently analyzing and classifying hand [...] Read more.
The kinematics of the human hand exhibit complex and diverse characteristics unique to each individual. Various techniques such as vision-based, ultrasonic-based, and data-glove-based approaches have been employed to analyze human hand movements. However, a critical challenge remains in efficiently analyzing and classifying hand grasp types based on time-series kinematic data. In this paper, we propose a novel sparse coding feature extraction technique based on dictionary learning to address this challenge. Our method enhances model accuracy, reduces training time, and minimizes overfitting risk. We benchmarked our approach against principal component analysis (PCA) and sparse coding based on a Gaussian random dictionary. Our results demonstrate a significant improvement in classification accuracy: achieving 81.78% with our method compared to 31.43% for PCA and 77.27% for the Gaussian random dictionary. Furthermore, our technique outperforms in terms of macro-average F1-score and average area under the curve (AUC) while also significantly reducing the number of features required. Full article
(This article belongs to the Special Issue Algorithms for Feature Selection (2nd Edition))
Show Figures

Figure 1

21 pages, 440 KiB  
Article
Assessing the Ability of Genetic Programming for Feature Selection in Constructing Dispatching Rules for Unrelated Machine Environments
by Marko Đurasević, Domagoj Jakobović, Stjepan Picek and Luca Mariot
Algorithms 2024, 17(2), 67; https://doi.org/10.3390/a17020067 - 4 Feb 2024
Cited by 1 | Viewed by 2728
Abstract
The automated design of dispatching rules (DRs) with genetic programming (GP) has become an important research direction in recent years. One of the most important decisions in applying GP to generate DRs is determining the features of the scheduling problem to be used [...] Read more.
The automated design of dispatching rules (DRs) with genetic programming (GP) has become an important research direction in recent years. One of the most important decisions in applying GP to generate DRs is determining the features of the scheduling problem to be used during the evolution process. Unfortunately, there are no clear rules or guidelines for the design or selection of such features, and often the features are simply defined without investigating their influence on the performance of the algorithm. However, the performance of GP can depend significantly on the features provided to it, and a poor or inadequate selection of features for a given problem can result in the algorithm performing poorly. In this study, we examine in detail the features that GP should use when developing DRs for unrelated machine scheduling problems. Different types of features are investigated, and the best combination of these features is determined using two selection methods. The obtained results show that the design and selection of appropriate features are crucial for GP, as they improve the results by about 7% when only the simplest terminal nodes are used without selection. In addition, the results show that it is not possible to outperform more sophisticated manually designed DRs when only the simplest problem features are used as terminal nodes. This shows how important it is to design appropriate composite terminal nodes to produce high-quality DRs. Full article
(This article belongs to the Special Issue Algorithms for Feature Selection (2nd Edition))
Show Figures

Figure 1

Back to TopTop