Article

Feature Selection Method Based on Simultaneous Perturbation Stochastic Approximation Technique Evaluated on Cancer Genome Data Classification

by Satya Dev Pasupuleti and Simone A. Ludwig *
Department of Computer Science, North Dakota State University, Fargo, ND 58105, USA
* Author to whom correspondence should be addressed.
Algorithms 2025, 18(10), 622; https://doi.org/10.3390/a18100622
Submission received: 13 August 2025 / Revised: 18 September 2025 / Accepted: 29 September 2025 / Published: 1 October 2025
(This article belongs to the Special Issue Algorithms in Data Classification (3rd Edition))

Abstract

Cancer classification using high-dimensional genomic data presents significant challenges in feature selection, particularly when dealing with datasets containing tens of thousands of features. This study presents a new application of the Simultaneous Perturbation Stochastic Approximation (SPSA) method for feature selection on large-scale cancer datasets, representing the first investigation of the SPSA-based feature selection technique applied to cancer datasets of this magnitude. Our research extends beyond traditional SPSA applications, which have historically been limited to smaller datasets, by evaluating its effectiveness on datasets containing 35,924 to 44,894 features. Building upon established feature-ranking methodologies, we introduce a comprehensive evaluation framework that examines the impact of varying proportions of top-ranked features (5%, 10%, and 15%) on classification performance. This systematic approach enables the identification of optimal feature subsets most relevant to cancer detection across different selection thresholds. The key contributions of this work include the following: (1) the first application of SPSA-based feature selection to large-scale cancer datasets exceeding 35,000 features, (2) an evaluation methodology examining multiple feature proportion thresholds to optimize classification performance, (3) comprehensive experimental validation through comparison with ten state-of-the-art feature selection and classification methods, and (4) statistical significance testing to quantify the improvements achieved by the SPSA approach over benchmark methods. Our experimental evaluation demonstrates the effectiveness of the feature selection and ranking-based SPSA method in handling high-dimensional cancer data, providing insights into optimal feature selection strategies for genomic classification tasks.
Keywords: high dimensional data; classification models; SPSA; feature selection; machine learning; cancer genomics
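
The ranking idea summarized in the abstract can be illustrated with a minimal sketch. It assumes a generic SPSA update over per-feature importance weights and a simple nearest-centroid classifier as the loss. The abstract does not specify the authors' gain sequences, classifier, or stopping rule, so the function names (`holdout_error`, `spsa_feature_weights`, `top_fraction`), the constants `a` and `c`, and the synthetic data are all hypothetical choices made for illustration only.

```python
import numpy as np


def holdout_error(X_tr, y_tr, X_va, y_va, mask):
    """Validation error of a nearest-centroid classifier on the masked features.

    Assumption: this classifier is a stand-in; any wrapper loss could be used.
    """
    if not mask.any():
        return 1.0  # no feature selected: treat as the worst case
    A, B = X_tr[:, mask], X_va[:, mask]
    classes = np.unique(y_tr)
    centroids = np.stack([A[y_tr == c].mean(axis=0) for c in classes])
    dists = ((B[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    pred = classes[dists.argmin(axis=1)]
    return float((pred != y_va).mean())


def spsa_feature_weights(X_tr, y_tr, X_va, y_va, n_iter=200, a=0.5, c=0.1, seed=0):
    """Estimate per-feature importance weights in [0, 1] with a basic SPSA loop.

    Each iteration perturbs all weights at once with a random +/-1 vector and
    needs only two loss evaluations, regardless of the number of features.
    The constant gains `a` and `c` are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    p = X_tr.shape[1]
    w = np.full(p, 0.5)
    for _ in range(n_iter):
        delta = rng.choice([-1.0, 1.0], size=p)
        w_plus = np.clip(w + c * delta, 0.0, 1.0)
        w_minus = np.clip(w - c * delta, 0.0, 1.0)
        # A feature counts as selected when its perturbed weight is >= 0.5.
        loss_plus = holdout_error(X_tr, y_tr, X_va, y_va, w_plus >= 0.5)
        loss_minus = holdout_error(X_tr, y_tr, X_va, y_va, w_minus >= 0.5)
        # Simultaneous-perturbation gradient estimate (elementwise division).
        grad = (loss_plus - loss_minus) / (2.0 * c * delta)
        w = np.clip(w - a * grad, 0.0, 1.0)
    return w


def top_fraction(w, fraction):
    """Indices of the top-ranked features for a given proportion (e.g. 0.05)."""
    k = max(1, int(round(fraction * w.size)))
    return np.argsort(w)[::-1][:k]


# Synthetic demonstration data (not the cancer datasets from the article).
rng = np.random.default_rng(42)
X = rng.normal(size=(80, 200))
y = np.repeat([0, 1], 40)
X[y == 1, :10] += 2.0  # only the first 10 features carry class signal
order = rng.permutation(80)
tr, va = order[:56], order[56:]

weights = spsa_feature_weights(X[tr], y[tr], X[va], y[va], n_iter=100)
for frac in (0.05, 0.10, 0.15):
    subset = top_fraction(weights, frac)
    err = holdout_error(X[tr], y[tr], X[va], y[va], np.isin(np.arange(200), subset))
    print(f"top {frac:.0%}: {subset.size} features, holdout error {err:.3f}")
```

The sketch needs only two loss evaluations per iteration however many features there are, which is why SPSA is a candidate for datasets with tens of thousands of features. Comparing the 5%, 10%, and 15% subsets mirrors the thresholds named in the abstract, but the numbers printed here come from synthetic data and say nothing about the article's results.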