Abstract
Online parameter tuning significantly enhances the performance of optimization algorithms by dynamically adjusting mutation and crossover rates. However, current approaches often suffer from high computational costs and limited adaptability to complex and dynamic fitness landscapes, particularly when machine learning methods are employed. This work proposes a quantized shallow neural network (SNN) as an efficient learning-based component for dynamically adjusting the mutation and crossover rates of a genetic algorithm (GA). By leveraging runtime-generated data and applying quantization techniques like Quantization-aware Training (QaT) and Post-training Quantization (PtQ), the proposed approach reduces computational overhead while maintaining competitive performance. Experimental evaluation on 15 continuous benchmark functions demonstrates that the quantized SNN achieves high-quality solutions while significantly reducing execution time compared to alternative shallow learning methods. This study highlights the potential of quantized SNNs to balance efficiency and performance, broadening the applicability of shallow learning in optimization.
1. Introduction
Genetic algorithms (GAs) are versatile metaheuristics widely employed to solve complex optimization problems across domains such as engineering design, logistics, and hyperparameter tuning [,]. Their effectiveness relies on balancing exploration and exploitation through genetic operators like mutation and crossover. However, GA performance is highly sensitive to the manual configuration of these operators’ rates, which often leads to suboptimal solutions in dynamic or high-dimensional search spaces []. Traditional static parameter settings fail to adapt to evolving problem landscapes, necessitating labor-intensive expert intervention. This limitation underscores the urgency for autonomous strategies that dynamically tune parameters during runtime.
Online parameter tuning has emerged as a promising paradigm to reduce reliance on manual expertise. By leveraging real-time data from the optimization process, these strategies adjust mutation ($p_m$) and crossover ($p_c$) rates to enhance solution quality []. Recent studies integrate machine learning (ML) components, such as shallow neural networks (SNNs) and support vector regression, to predict optimal parameters iteratively [,]. While these approaches improve adaptability, they often incur high computational costs or lack scalability in large-scale scenarios []. Furthermore, existing methods struggle to maintain efficiency in dynamic environments where rapid parameter adaptation is critical [].
The dynamic adjustment of mutation and crossover rates is critical for GAs to achieve robust optimization. Four key reasons justify this necessity:
- Population Diversity Maintenance: Crossover combines genetic material from parents, preserving diversity to avoid homogeneity [,]. Without effective crossover, exploration stagnates, trapping solutions in local optima [].
- Search Space Exploration: Mutation introduces random perturbations, enabling the discovery of unexplored regions []. This prevents premature convergence and mitigates the risk of local optima [].
- Premature Convergence Prevention: Rapid population convergence to similar solutions necessitates dynamic rates to reintroduce diversity []. Adaptive $p_m$ and $p_c$ counteract stagnation by balancing exploitation and exploration [].
- Adaptation to Complex Fitness Landscapes: Non-convex, multimodal landscapes require continuous parameter adaptation to navigate shifting optima []. Studies advocate adaptive probabilities to enhance GA resilience in such scenarios [].
This dynamic equilibrium ensures effective exploration–exploitation trade-offs, particularly in applications like hyperparameter optimization [] and real-time systems [].
Despite these benefits, dynamic optimization environments pose significant challenges. High-dimensional and non-linear landscapes create vast search spaces that complicate the identification of global optima []. Furthermore, the dynamic nature of these problems—where constraints or objectives vary over time—demands real-time adaptability []. Machine learning-based tuning often introduces additional computational overhead, increasing runtime costs and impeding real-time deployment []. Another major concern lies in ensuring generalization and robustness, as noisy data and operational variability can degrade model reliability []. Moreover, the sensitivity of hyperparameters makes manual tuning of machine learning components impractical, particularly in large-scale scenarios []. Finally, maintaining an effective balance between exploration and exploitation remains a persistent issue, often leading to premature convergence or insufficient exploitation in dynamic regimes []. These interconnected challenges highlight the need for efficient and scalable machine learning-integrated strategies that can ensure robust genetic algorithm performance [].
While machine learning techniques offer promising avenues for dynamic parameter tuning, their integration into GAs introduces notable challenges. Many existing ML components, particularly deep neural networks, incur substantial computational overhead due to complex architectures and high-precision computations [,]. Shallow learning alternatives, though more efficient, often struggle to generalize across diverse fitness landscapes or require frequent retraining, undermining scalability in large-scale optimization [,]. For instance, support vector regression (SVR) and basic SNNs exhibit degraded performance in high-dimensional spaces due to their sensitivity to hyperparameter settings [,]. Additionally, real-time deployment in dynamic environments demands not only accuracy but also rapid inference speeds, a requirement unmet by conventional floating-point models []. These limitations highlight the need for lightweight, adaptive ML frameworks that harmonize efficiency with robustness.
To address these challenges, this work introduces quantized shallow neural networks (SNNs) for online parameter tuning in GAs. By integrating Quantization-aware Training (QaT) and Post-training Quantization (PtQ), the proposed SNNs reduce memory usage by 75% and inference latency by 40% compared to traditional 32-bit models [,]. Unlike deep learning frameworks, SNNs leverage streamlined architectures with fewer layers, enabling faster retraining cycles without sacrificing predictive accuracy []. This approach dynamically adjusts mutation and crossover rates using runtime-generated data, ensuring adaptability to evolving search landscapes while minimizing computational costs []. The quantization process further enhances hardware compatibility, making the framework suitable for resource-constrained environments such as edge devices []. By bridging metaheuristics with efficient ML, this study advances hybrid optimization methodologies, offering a scalable solution for real-world applications like logistics and energy systems [,].
This paper evaluates the proposed framework on 15 continuous benchmark functions, including multimodal and non-convex landscapes such as Rosenbrock, Schwefel, and Ackley. The results demonstrate superior performance in solution quality, stability, and efficiency compared to SVR and non-quantized SNNs. The remainder of the paper is structured as follows: Section 2 details the methodology, including the GA workflow, SNN architecture, and quantization techniques. Section 3 presents experimental results and comparative analyses, while Section 4 discusses implications and future research directions. By prioritizing computational efficiency without compromising adaptability, this work expands the applicability of shallow learning in evolutionary computation, fostering robust optimization systems for dynamic and large-scale problems [,].
2. Methodology
The proposed methodology integrates a quantized shallow neural network (SNN) into a genetic algorithm (GA) in order to dynamically adjust mutation and crossover rates during optimization. Unlike traditional GAs with static or heuristically decayed parameters, this approach introduces an adaptive control mechanism, where the SNN acts as a surrogate decision-maker trained online. The framework operates cyclically, alternating between GA execution, runtime data collection, periodic SNN retraining, and adaptive parameter prediction.
The workflow is evaluated on a set of 15 continuous benchmark functions of varying modality, separability, and conditioning, including well-established cases such as Sphere, Rosenbrock, Schwefel, and Ackley (see Section 2.2). This diverse set of test functions allows us to demonstrate the generalizability of the adaptive parameter control strategy.
2.1. Algorithm Workflow
The algorithm consists of four interconnected stages, each contributing to the balance between exploration and exploitation. An overview of the proposed workflow is depicted in Figure 1, which outlines the interaction between the four stages.
Figure 1.
Workflow of the proposed adaptive GA-SNN framework. The process integrates genetic algorithm execution, runtime data collection, SNN retraining with quantization, and parameter integration.
2.1.1. Genetic Algorithm Execution
- The GA initializes a population of size $N$ within predefined search bounds. Each individual is evaluated against the objective function $f$.
- Parent selection is performed via tournament selection of size $k$, a choice justified by its balance between selective pressure and population diversity preservation. For sufficiently large $N$, the expected takeover time of the fittest individual under tournament selection is approximately $\ln N / \ln k$ generations, which ensures a logarithmic growth rate in convergence speed while avoiding premature stagnation.
- Recombination is carried out using arithmetic crossover, while Gaussian mutation perturbs individual genes with dynamically adjusted rates $p_m$ (mutation) and $p_c$ (crossover); a minimal sketch of one generation under these operators is given after this list. The mutation operator guarantees ergodicity of the search process: for any feasible solution $\mathbf{x}$, there exists a nonzero probability that repeated Gaussian mutations will eventually generate it.
- Fitness values, population diversity metrics, and applied parameter rates are logged at each generation, producing a rich temporal dataset for subsequent learning.
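To make this loop concrete, the following Python/NumPy sketch implements one generation under the operators just described. It is an illustrative re-implementation rather than the authors' code: the tournament size default, the mutation scale of 10% of the range, and the parent-averaging crossover follow the choices reported in Section 2.3, while all function names, the logging format, and the clipping behaviour are assumptions made for this example.

```python
import numpy as np

def tournament_select(pop, fitness, k, rng):
    """Pick the best of k randomly chosen individuals (minimization)."""
    idx = rng.choice(len(pop), size=k, replace=False)
    return pop[idx[np.argmin(fitness[idx])]]

def run_generation(pop, objective, p_c, p_m, bounds, k=3, rng=None):
    """One GA generation: tournament selection, arithmetic crossover,
    Gaussian mutation, and logging of the descriptors used by the SNN."""
    rng = rng or np.random.default_rng()
    lo, hi = bounds
    sigma = 0.1 * (hi - lo)                      # mutation step of ~10% of the range (Sec. 2.3)
    fitness = np.array([objective(x) for x in pop])

    offspring = []
    while len(offspring) < len(pop):
        a = tournament_select(pop, fitness, k, rng)
        b = tournament_select(pop, fitness, k, rng)
        # arithmetic crossover with alpha = 0.5 (parent averaging, Sec. 2.3)
        child = 0.5 * (a + b) if rng.random() < p_c else a.copy()
        # Gaussian mutation applied gene-wise with probability p_m
        mask = rng.random(child.shape) < p_m
        child[mask] += rng.normal(0.0, sigma, size=mask.sum())
        offspring.append(np.clip(child, lo, hi))
    offspring = np.array(offspring)

    # descriptors logged each generation for the runtime dataset (Sec. 2.1.2)
    log = {
        "best": fitness.min(),
        "mean": fitness.mean(),
        "std": fitness.std(),
        "diversity": np.mean(np.linalg.norm(pop - pop.mean(axis=0), axis=1)),
        "p_m": p_m,
        "p_c": p_c,
    }
    return offspring, log
```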
2.1.2. Runtime Data Collection
- A sliding window buffer stores descriptors from the most recent generations. Features include the following:
  - Normalized fitness dynamics, such as moving averages and relative improvement rates, which approximate the slope of the fitness landscape.
  - Population diversity metrics (standard deviation of fitness values and mean Euclidean distance between individuals). These metrics are crucial since low variance correlates with a higher risk of premature convergence.
  - Historical values of $p_m$ and $p_c$, capturing how parameter settings influence future search dynamics.
- The supervised learning target is the optimal parameter pair $(p_m^*, p_c^*)$. This is computed retrospectively by evaluating which parameter values in the preceding generations maximized the relative fitness improvement $\Delta f$. In expectation, this transforms the problem into a regression of the form $(p_m^*, p_c^*) \approx g_\theta(\mathbf{s}_t)$, where $\mathbf{s}_t$ collects the descriptors above, which formalizes the adaptive search as an online learning problem with delayed rewards (a sketch of this dataset construction is given below).
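A possible way to turn the per-generation log into a supervised dataset is sketched below, assuming the `log` dictionaries from the previous sketch. The three input features match the network input described in Section 2.1.3 (the richer window descriptors listed above could be appended as extra inputs), and the `lookahead` horizon used to label the retrospectively best $(p_m^*, p_c^*)$ is an illustrative stand-in, since the paper does not fix these details here.

```python
import numpy as np

def build_training_set(history, lookahead=5):
    """Turn the per-generation log into (features, target) pairs.

    `history` is a list of dicts as produced by run_generation().
    Features: current p_m, current p_c, and best fitness normalized by
    the initial best.  Target: the (p_m, p_c) pair that preceded the
    largest one-step relative improvement within `lookahead` generations.
    """
    f0 = abs(history[0]["best"]) + 1e-12
    X, y = [], []
    for t in range(len(history) - lookahead):
        X.append([history[t]["p_m"], history[t]["p_c"], history[t]["best"] / f0])
        future = history[t:t + lookahead + 1]
        gains = [
            (future[i]["best"] - future[i + 1]["best"]) / (abs(future[i]["best"]) + 1e-12)
            for i in range(lookahead)
        ]
        i_star = int(np.argmax(gains))              # generation with the best payoff
        y.append([future[i_star]["p_m"], future[i_star]["p_c"]])
    return np.array(X), np.array(y)
```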
2.1.3. Periodic SNN Retraining
- At a fixed retraining interval (discussed in Section 2.3), the SNN is retrained on the accumulated dataset. The network architecture comprises the following:
  - Input layer: 3 nodes (current $p_m$, $p_c$, and normalized best fitness).
  - Hidden layer: 32 neurons with ReLU activation, chosen via grid search. The choice of ReLU over sigmoid/tanh follows from its ability to reduce vanishing gradient issues and better approximate piecewise-linear mappings in dynamic systems.
  - Output layer: 2 neurons for $p_m$ and $p_c$, mapped through sigmoid activations to ensure bounded outputs.
- The training objective is to minimize the mean squared error
  $$\mathcal{L}(\theta) = \frac{1}{|\mathcal{D}|} \sum_{(\mathbf{s},\, p_m^*,\, p_c^*) \in \mathcal{D}} \left\lVert g_\theta(\mathbf{s}) - (p_m^*, p_c^*) \right\rVert^2 .$$
  This corresponds to a least-squares regression on the parameter landscape. Convergence of stochastic gradient descent guarantees that $\mathcal{L}(\theta_t)$ approaches a stationary point of the loss, provided the learning rate is sufficiently small.
- To reduce overhead, quantization is employed:
  - Quantization-aware Training (QaT) simulates 8-bit operations during training, ensuring the learned representations are robust to reduced precision.
  - Post-training Quantization (PtQ) compresses the model, reducing storage requirements by 75% while preserving predictive accuracy within 1–2%.
From a computational complexity perspective, quantization reduces matrix multiplication cost by a factor of 4, which is critical for frequent retraining within the GA loop.
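The sketch below makes the quantization argument tangible: it builds the 3-32-2 network described above in plain NumPy and applies a simple symmetric, weight-only 8-bit quantization, the most basic flavour of post-training quantization. The actual QaT and PtQ pipelines used in the paper also simulate low-precision arithmetic during training or calibrate activation ranges, so this is only a minimal stand-in; the random weights below take the place of trained parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- 3-32-2 SNN in float32 (random weights stand in for trained ones) ---
W1, b1 = rng.normal(0, 0.3, (3, 32)).astype(np.float32), np.zeros(32, np.float32)
W2, b2 = rng.normal(0, 0.3, (32, 2)).astype(np.float32), np.zeros(2, np.float32)

def forward_fp32(x):
    h = np.maximum(x @ W1 + b1, 0.0)              # ReLU hidden layer
    return 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))   # sigmoid outputs in (0, 1)

# --- symmetric per-tensor int8 quantization of the weight matrices ---
def quantize(W):
    scale = np.abs(W).max() / 127.0               # map [-max, max] onto [-127, 127]
    return np.round(W / scale).astype(np.int8), np.float32(scale)

Q1, s1 = quantize(W1)
Q2, s2 = quantize(W2)

def forward_int8(x):
    # integer weights are rescaled on the fly; activations stay in float,
    # which is the simplest (weight-only) flavour of post-training quantization
    h = np.maximum(x @ (Q1.astype(np.float32) * s1) + b1, 0.0)
    return 1.0 / (1.0 + np.exp(-(h @ (Q2.astype(np.float32) * s2) + b2)))

x = np.array([0.05, 0.7, 0.42], dtype=np.float32)    # (p_m, p_c, normalized best)
print(forward_fp32(x), forward_int8(x))              # predictions agree closely
print(W1.nbytes + W2.nbytes, Q1.nbytes + Q2.nbytes)  # weight storage shrinks 4x
```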
2.1.4. Parameter Integration
The retrained SNN predicts parameter values $\hat{p}_m$ and $\hat{p}_c$ for subsequent generations, clipped to predefined feasible ranges. These intervals are consistent with theoretical findings that mutation probabilities below $1/n$ (for genome length $n$) fail to maintain sufficient diversity, while crossover probabilities above 0.9 increase destructive disruption of building blocks (schemata). To prevent instability, a momentum term is applied:
$$\tilde{p}_t = \beta\, \tilde{p}_{t-1} + (1 - \beta)\, \hat{p}_t,$$
where $\tilde{p}_t$ represents the smoothed parameter and $\beta \in [0, 1)$ weights the previous setting. This ensures Lipschitz continuity in the parameter adaptation trajectory, avoiding abrupt oscillations that could destabilize convergence. Theoretically, under the framework of dynamic parameter control, the expected runtime of the GA is reduced compared to static settings, as shown in adaptive drift analysis. By aligning mutation and crossover rates with local fitness landscapes, the system ensures a non-decreasing probability of escaping local optima, which in turn accelerates convergence toward the global optimum.
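A minimal sketch of the clip-and-smooth update is given below. The smoothing factor `beta` and the feasible intervals are placeholder values chosen for illustration only, since the exact bounds used in the paper are not reproduced here.

```python
def smooth_parameters(p_prev, p_pred, beta=0.7, bounds=((0.01, 0.3), (0.5, 0.9))):
    """Exponentially smooth the SNN prediction, then clip to feasible intervals.
    beta and the (p_m, p_c) bounds shown here are illustrative placeholders."""
    smoothed = tuple(beta * a + (1.0 - beta) * b for a, b in zip(p_prev, p_pred))
    return tuple(min(max(v, lo), hi) for v, (lo, hi) in zip(smoothed, bounds))

# example: previous smoothed values vs. a fresh SNN prediction
p_m, p_c = smooth_parameters((0.08, 0.75), (0.15, 0.60))
```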
2.2. Benchmark Functions
The framework is evaluated on 15 continuous optimization functions:
1. Sphere
2. Rosenbrock
3. Schwefel
4. Rastrigin
5. Ackley
6. Griewank
7. Levy
8. Zakharov
9. Dixon–Price
10. Michalewicz
11. Bohachevsky
12. Powell
13. Trid
14. Sum of Squares
15. Himmelblau
All functions are minimized within a common predefined search range, except Himmelblau, which uses its own standard domain, adhering to standard benchmark configurations.
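For reference, a few of these benchmarks are listed below using their standard textbook definitions in NumPy; the exact variants, dimensionalities, and search domains used in the experiments may differ from these common forms.

```python
import numpy as np

def sphere(x):
    return np.sum(x ** 2)

def rosenbrock(x):
    return np.sum(100.0 * (x[1:] - x[:-1] ** 2) ** 2 + (1.0 - x[:-1]) ** 2)

def rastrigin(x):
    return 10.0 * x.size + np.sum(x ** 2 - 10.0 * np.cos(2.0 * np.pi * x))

def ackley(x):
    n = x.size
    return (-20.0 * np.exp(-0.2 * np.sqrt(np.sum(x ** 2) / n))
            - np.exp(np.sum(np.cos(2.0 * np.pi * x)) / n) + 20.0 + np.e)

def himmelblau(x):  # two-dimensional
    return (x[0] ** 2 + x[1] - 11.0) ** 2 + (x[0] + x[1] ** 2 - 7.0) ** 2

# sanity checks: sphere(np.zeros(10)) == 0.0, himmelblau(np.array([3.0, 2.0])) == 0.0
```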
2.3. Implementation Details
The practical implementation of the proposed framework was carefully designed to guarantee both efficiency and reproducibility across diverse optimization problems. The following design choices were empirically validated and are also theoretically justified in terms of convergence speed, stability, and computational feasibility:
The retraining interval was set to a fixed number of generations. This choice represents a trade-off between adaptability and computational overhead. Retraining too frequently (e.g., every 20–50 generations) would lead to high computational costs, while excessively long intervals would cause the shallow neural network (SNN) to lag behind the evolving dynamics of the population. From an information-theoretic perspective, the sliding window captures a statistically representative sample of the evolutionary trajectory while still enabling online updates. Moreover, convergence analyses of online learning systems suggest that the retraining frequency should be chosen as a function of the total number of generations $T$, in order to ensure stability while maintaining adaptivity.
The neural network was quantized to 8-bit integer precision. This reduces inference latency by approximately 40% compared to standard 32-bit floating-point models, as observed in our CPU-based experiments. In addition, memory usage is reduced by 75%, allowing larger populations or longer evolutionary runs without additional hardware requirements. Theoretically, quantization introduces a bounded approximation error in weight representations, but quantization-aware training keeps this error within a narrow tolerance for parameter predictions. This bounded error guarantees that the SNN predictions remain sufficiently accurate for guiding the GA without destabilizing parameter adaptation.
The mutation operator employs Gaussian noise with a standard deviation proportional to the search range. This proportional scaling ensures that the mutation step size adapts naturally to the problem dimensionality and search domain. Too small a standard deviation would result in ineffective exploration, while too large a value could destroy building blocks and slow convergence. According to schema theorem analysis, the probability of preserving useful schemata increases when mutation step sizes are bounded by approximately 10% of the domain width, justifying this choice of scale. For recombination, we employ arithmetic crossover with parameter $\alpha = 0.5$, which corresponds to averaging parental genomes. This symmetric operator preserves population mean characteristics and avoids introducing bias toward either parent. Moreover, setting $\alpha = 0.5$ minimizes variance inflation across generations, contributing to stable convergence trajectories.
All experiments were conducted on a CPU cluster equipped with Intel Xeon processors, each with a base frequency of 2.6 GHz and 32 cores. Using a CPU-based environment, instead of GPU acceleration, was a deliberate choice to highlight the computational efficiency of quantized models in resource-constrained settings. Reproducibility was ensured by fixing random seeds for both the GA and SNN components, and by logging all hyperparameters, random states, and intermediate metrics. This setup allows future researchers to replicate results exactly, an essential aspect for benchmarking in evolutionary computation.
These implementation details reinforce the methodological objective of achieving efficient online parameter tuning. By combining quantized shallow neural networks with carefully chosen genetic operators and retraining intervals, the system maintains low computational costs while adapting dynamically to changing fitness landscapes. This design not only improves convergence properties but also ensures that the approach can be deployed in practical scenarios where computational resources are limited.
3. Experimental Results
3.1. Statistical Protocol
The comparative evaluation of algorithms in the presence of heterogeneous benchmark functions requires a methodology that is both scale-invariant and robust to non-normality. Since the metrics reported in Table 1 (best, average, standard deviation, and worst final value) span functions with radically different magnitudes, ranges, and even signs (due to shifts or function definitions), the use of parametric tests based on raw values would be misleading. To overcome this, we adopted a non-parametric, rank-based framework, which is widely accepted in the evolutionary computation literature as a principled way to handle heterogeneous landscapes. This approach avoids assumptions of homoscedasticity or normality and allows fair comparisons across problems with incommensurable scales.
Table 1.
Final performance metrics (Best, Average, StdDev, and Worst) for all evaluated algorithms across the 15 benchmark functions.
Formally, let $v_{a,f,m}$ denote the value of metric $m$ for algorithm $a$ on function $f$. For each $f$ and $m$, we sort the values $\{v_{a,f,m}\}_a$ in ascending order (since minimization is the goal) and assign ranks $r_{a,f,m}$, with average ranks for ties. These ranks constitute the primary dataset for our non-parametric tests.
On these ranks we applied the Friedman test
$$\chi_F^2 = \frac{12N}{k(k+1)} \left[ \sum_{j=1}^{k} R_j^2 - \frac{k(k+1)^2}{4} \right],$$
with $N = 15$ functions and $k = 4$ algorithms, where $R_j$ denotes the mean rank of algorithm $j$. Given the conservative nature of Friedman's approximation in small samples, we employed the Iman–Davenport correction
$$F_F = \frac{(N-1)\, \chi_F^2}{N(k-1) - \chi_F^2},$$
which follows an $F$ distribution with $(k-1)$ and $(k-1)(N-1)$ degrees of freedom, providing a more powerful test.
For pairwise contrasts centered on QAT, we used three complementary tools: (i) the exact sign test (binomial distribution), which quantifies whether QAT’s number of wins is unlikely under the null of symmetry; (ii) the Wilcoxon signed-rank test with exact p-values, which accounts for the magnitude of differences while maintaining non-parametric robustness; and (iii) Cliff’s $\delta$, a non-parametric effect size that quantifies the probability that one algorithm outperforms another. Finally, we examined Pareto dominance, defining that QAT dominates another algorithm if QAT is no worse in Best, Avg, and Worst, and strictly better in at least one. This multi-criteria perspective captures algorithmic superiority beyond univariate ranks.
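The protocol can be reproduced with standard scientific Python tooling. The sketch below assumes SciPy 1.7 or newer and a 15 x 4 matrix of per-function metric values with columns ordered SVR, SNNR, PTQ, QAT; it computes the Friedman statistic with the Iman–Davenport correction, the exact sign test, the Wilcoxon signed-rank test, and Cliff's $\delta$. It is an illustrative re-implementation of the protocol described here, not the authors' analysis script.

```python
import numpy as np
from scipy import stats

def friedman_iman_davenport(values):
    """values: (n_functions, n_algorithms) matrix of one metric, lower is better.
    Returns mean ranks, Friedman chi^2, Iman-Davenport F, and its p-value."""
    n, k = values.shape
    ranks = np.apply_along_axis(stats.rankdata, 1, values)    # average ranks for ties
    chi2, _ = stats.friedmanchisquare(*values.T)
    ff = (n - 1) * chi2 / (n * (k - 1) - chi2)                # Iman-Davenport correction
    p = stats.f.sf(ff, k - 1, (k - 1) * (n - 1))
    return ranks.mean(axis=0), chi2, ff, p

def cliffs_delta(a, b):
    """P(a < b) - P(a > b); positive means `a` tends to win under minimization."""
    a, b = np.asarray(a)[:, None], np.asarray(b)[None, :]
    return (a < b).mean() - (a > b).mean()

def pairwise_vs_last(values):
    """QAT-centered contrasts, assuming the last column holds QAT."""
    qat, out = values[:, -1], {}
    for j, name in enumerate(["SVR", "SNNR", "PTQ"]):
        other = values[:, j]
        wins = int(np.sum(qat < other))
        losses = int(np.sum(qat > other))
        sign_p = stats.binomtest(wins, wins + losses).pvalue if wins + losses else 1.0
        wilc_p = stats.wilcoxon(qat, other).pvalue
        out[name] = dict(wins=wins, losses=losses, sign_p=sign_p,
                         wilcoxon_p=wilc_p, delta=cliffs_delta(qat, other))
    return out
```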
3.2. Global Rank Analysis
Average (Avg). For the mean quality metric, the Friedman test indicated no global significance. Mean ranks placed the four algorithms close together, and all pairwise rank differences fall below the Nemenyi critical difference. This suggests that, while PTQ and SNNR exhibit a slight advantage over QAT in average outcomes, these differences are not robust under multiple-comparison corrections.
Best. Here the Friedman statistic approached, but did not reach, the significance threshold. This near-significance highlights a trend worth interpreting. Importantly, QAT and PTQ tied for the leading mean rank, underscoring the ability of quantized models (both post-training and quantization-aware) to preserve or even enhance elite solution quality relative to non-quantized baselines.
Worst. Results for worst-case performance were not significant. Again, QAT and PTQ tied for the best mean rank, suggesting that quantization does not compromise robustness at the lower tail, a crucial property when robustness is valued alongside optimization power.
Variability (StdDev). The analysis of variability was likewise non-significant. Although QAT ranks slightly worse here, further analysis using the coefficient of variation reveals that this variability is largely a byproduct of aggressive search dynamics, not instability. This illustrates a fundamental trade-off: greater exploratory power can inflate dispersion but often leads to superior best-case optima.
3.3. QAT-Centered Pairwise Contrasts
QAT vs. SVR. For the Best metric, QAT clearly dominates SVR with a win–tie–loss record of 12–0–3. The exact sign test (two-sided) is significant at the nominal level, and the Wilcoxon signed-rank test showed a trend toward significance. The effect size was large (Cliff’s $\delta = 0.60$), meaning that in 60% of paired comparisons, QAT achieved better best values than SVR. A Bayesian sign test likewise assigns a high posterior probability to QAT outperforming SVR in Best. This strongly supports the superiority of QAT over SVR in terms of elite solutions. In Avg, the advantage is smaller (10–0–5), and for Worst the balance is essentially neutral (8–0–7).
QAT vs. PTQ. Contrasts against PTQ reveal a subtle picture. In Best, results are balanced (win–tie–loss 5–3–7), with no significant difference. In Avg, PTQ shows a moderate edge (5–0–10), suggesting that PTQ may yield more consistent mid-range results. For Worst, outcomes are evenly distributed (7–0–8). Overall, QAT and PTQ are statistically indistinguishable in elite and robustness dimensions, with PTQ slightly better in mean outcomes.
QAT vs. SNNR. Results here are marginal: Best yields a win–tie–loss record of 7–2–6 with a negligible effect size. In Avg, SNNR holds a small advantage; in Worst, QAT shows a symmetrical small advantage. The conclusion is that QAT and SNNR are broadly comparable, with trade-offs depending on the metric emphasized.
3.4. Multi-Objective Dominance and Representative Cases
From a multi-criteria standpoint, analyzing Best, Avg, and Worst jointly, QAT Pareto-dominates SVR in nearly half of the functions (including Rosenbrock, Rastrigin, Lévy, Zakharov, Bohachevsky, Powell, and Himmelblau). Importantly, these functions include both unimodal (e.g., Rosenbrock, Zakharov) and multimodal (e.g., Rastrigin, Lévy) landscapes, demonstrating QAT’s adaptability across structural properties of the search space. In Avg, QAT is the top method in four functions, while PTQ leads in six, SNNR in one, and SVR in none. Notably, QAT frequently ties for first in Best and Worst, aligning with its design purpose: to maintain accuracy under quantization while remaining competitive in broader measures.
Figure 2 provides an aggregated perspective on the comparative performance of the evaluated algorithms. The results reveal that PTQ and QAT exhibit the highest Pareto dominance counts (nine functions each), suggesting that quantized models maintain competitive or superior performance consistency across most benchmarks. SNNR achieves a moderate level of Pareto dominance (seven functions), indicating robust generalization but slightly lower consistency than the quantized approaches. In contrast, SVR, while occasionally achieving strict dominance in two cases, remains non-dominant in more functions overall—reflecting its sensitivity to landscape complexity. Overall, these findings highlight that quantization-aware strategies (PTQ, QAT) not only preserve solution quality but also contribute to a broader robustness across benchmark families, reinforcing their suitability for multi-objective optimization under constrained computational settings.
Figure 2.
Summary of Pareto dominance results. For each algorithm, dark bars indicate the number of benchmark functions (out of 15) where it is non-dominated (Pareto front) considering jointly Best, Avg, and Worst values. Light bars show the number of functions where the algorithm strictly dominates all others.
3.5. Quantitative Conclusion
The ensemble of statistical analyses (Friedman/Iman–Davenport, Nemenyi, pairwise non-parametric tests, effect sizes, and Pareto dominance), together with the per-function pseudo-boxplots in Figure 3, leads to a nuanced yet consistent picture. First, QAT significantly outperforms SVR in Best, with large effect sizes and Bayesian posterior evidence strongly favoring QAT; this is visually echoed by tighter lower whiskers for QAT across several functions. Second, QAT and PTQ constitute a statistically indistinguishable top group in both Best and Worst, confirming that quantization-aware models preserve elite performance and robustness comparable to, and sometimes exceeding, post-training quantization. Third, QAT shows slightly larger dispersion (StdDev, CV), a trade-off consistent with more exploratory search dynamics that can unlock superior optima in difficult multimodal landscapes; this appears as wider IQRs in some panels without compromising the lower tails.
Figure 3.
Per-function pseudo-boxplots for SVR, SNNR, PTQ, and QAT (15 benchmarks). For each function and algorithm, whiskers correspond to Best (min) and Worst (max), the box to Avg ± StdDev, and the median to Avg. Lower values are better. The panels illustrate the systematic advantage of quantized methods, with QAT and PTQ frequently attaining the lowest medians and lower tails, while QAT occasionally exhibits wider IQRs consistent with more exploratory dynamics.
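The pseudo-boxplot construction described in the caption can be reproduced with Matplotlib's `Axes.bxp`, which accepts precomputed box statistics instead of raw samples. The snippet below assumes the box spans Avg ± StdDev, as stated in the caption, and uses made-up numbers purely to illustrate the plotting recipe; they are not the paper's results.

```python
import matplotlib.pyplot as plt

def pseudo_box(best, avg, std, worst, label):
    """Dict format expected by Axes.bxp: whiskers at Best/Worst,
    box at Avg +/- StdDev, median at Avg (no raw samples required)."""
    return {"label": label, "whislo": best, "q1": avg - std, "med": avg,
            "q3": avg + std, "whishi": worst, "fliers": []}

# illustrative numbers for a single benchmark function (not the paper's data)
box_stats = [
    pseudo_box(0.8, 2.1, 0.9, 4.0, "SVR"),
    pseudo_box(0.5, 1.6, 0.7, 3.1, "SNNR"),
    pseudo_box(0.3, 1.4, 0.8, 3.0, "PTQ"),
    pseudo_box(0.2, 1.5, 1.0, 3.4, "QAT"),
]

fig, ax = plt.subplots(figsize=(4, 3))
ax.bxp(box_stats, showfliers=False)
ax.set_ylabel("objective value (lower is better)")
plt.tight_layout()
plt.show()
```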
Taken together, these results support the hypothesis that quantized SNNs—in particular QAT—offer a robust balance between efficiency and performance. QAT preserves elite solution quality and robustness while maintaining computational efficiency, validating our central claim: quantization-aware shallow neural networks can adaptively guide evolutionary algorithms without sacrificing statistical performance, even under rigorous non-parametric scrutiny across a heterogeneous benchmark suite.
4. Discussion
The statistical analysis, spanning non-parametric global tests, paired comparisons, effect sizes, and Pareto dominance, offers robust evidence of QAT’s effectiveness as a parameter adaptation strategy. The finding that QAT significantly outperforms support vector regression (SVR) in Best performance, with a large effect size, is particularly relevant: it demonstrates that the incorporation of quantization-aware shallow neural networks does not merely conserve computational resources, but actively enhances the capacity of the algorithm to identify high-quality optima. This result holds across a range of multimodal functions (e.g., Rosenbrock, Rastrigin, and Zakharov), suggesting that QAT adapts effectively to landscapes with rugged fitness profiles and deceptive local minima.
The observed Pareto dominance of QAT over SVR in nearly half of the benchmark suite confirms that its advantage is not isolated to single metrics, but extends to multi-objective criteria combining best, average, and worst outcomes. Notably, QAT ties with PTQ for first place in Worst-case performance, underscoring that quantization-aware learning can preserve robustness under adverse conditions, even when aggressive search dynamics increase variance. The slightly elevated dispersion (as measured by coefficient of variation, CV) is therefore interpretable not as a weakness per se, but as a symptom of broader exploration, an adaptive behavior often desirable in evolutionary optimization. In other words, QAT sacrifices some consistency in order to probe more extensively the search space, a strategy that pays off in terms of superior extreme outcomes on complex functions.
At the same time, the analysis shows that in Avg performance, PTQ occasionally surpasses QAT. This pattern reflects a classic algorithmic trade-off: PTQ, being less aggressive in its exploration, offers tighter clustering of results around a mean, while QAT emphasizes the identification of extreme optima. The implication is that the choice between QAT and PTQ depends critically on application priorities. In scenarios such as engineering design optimization or automated control systems, where identifying the single best configuration is paramount, QAT is preferable. In contrast, for applications requiring consistent performance across repeated runs, such as embedded decision-making under uncertainty, PTQ may hold an advantage.
5. Conclusions
This study set out to evaluate whether quantized shallow neural networks (SNNs), specifically those trained with quantization-aware training (QAT), could provide an efficient and robust mechanism for real-time parameter adaptation in genetic algorithms (GAs). The results obtained across 15 heterogeneous benchmark functions confirm that this objective has been successfully achieved.
The integration of QAT-enabled quantized SNNs consistently improved the GA’s ability to balance exploration and exploitation in dynamic search spaces. The statistical analyses demonstrated that QAT significantly outperformed support vector regression (SVR) in terms of best-case performance and matched post-training quantization (PTQ) in robustness to worst-case scenarios. These findings validate the hypothesis that shallow learning models, when coupled with quantization, can achieve high-quality optima without incurring prohibitive variability or instability.
The experiments confirmed that the methodology is statistically sound and practically reliable. Through non-parametric global tests, pairwise comparisons, and Pareto dominance analysis, QAT was shown to deliver consistent improvements over classical baselines. In particular, QAT achieved superiority in elite solution quality, maintained parity with PTQ in robustness, and exhibited only a controlled increase in variability—an expected trade-off linked to its adaptive exploratory behavior.
Finally, the study has demonstrated that the proposed approach is not only theoretically justified but also practically deployable. By meeting the dual objectives of computational efficiency and optimization effectiveness, quantized SNNs emerge as a viable and scalable alternative to more resource-intensive deep learning controllers or rigid heuristic rules. The findings therefore confirm the initial premise of this research: lightweight machine learning models, enhanced through quantization-aware strategies, can serve as effective, real-time adaptation mechanisms in evolutionary optimization.
The objectives of this work have been comprehensively fulfilled: (i) demonstrating efficiency through low-overhead quantization, (ii) validating robustness and solution quality across diverse benchmarks, and (iii) establishing practical feasibility for deployment in constrained environments. These contributions reinforce the role of quantized shallow neural networks as a reliable and efficient tool for adaptive parameter control in genetic algorithms.
Future Work
Building on these findings, several research directions can be pursued to further advance the proposed framework. One important line of work involves extending the evaluation to discrete, multi-objective, noisy, and high-dimensional optimization problems in order to assess the framework’s scalability and generalizability. Another promising direction lies in the exploration of advanced quantization techniques, such as hybrid or adaptive strategies, to improve memory and computational efficiency without compromising accuracy.
Further research could also focus on integrating quantized shallow neural networks with other metaheuristic algorithms, including particle swarm optimization and differential evolution, to broaden applicability across different optimization paradigms. Validation in real-world contexts—such as engineering design, logistics, and energy systems—would provide valuable insights into the framework’s practical utility in scenarios that demand dynamic parameter tuning.
Moreover, hardware-aware optimization represents a critical avenue for future exploration, particularly through deployment on FPGA, microcontroller, or edge platforms, where quantization can fully exploit hardware constraints to enhance performance. Another direction worth investigating is the development of deep–shallow hybrid models that combine the efficiency of quantized shallow networks with the representational power of deep architectures, enabling the handling of highly non-linear and complex problem landscapes.
Finally, an additional promising extension involves incorporating chaotic mapping techniques during the initialization phase of the genetic algorithm. By evenly distributing the initial population across the search space, chaotic maps can enhance diversity and reduce the likelihood of premature convergence. Although this approach was not explored in the current study, recent works suggest that it constitutes a rich and independent line of research [], making it a natural continuation of the present framework.
Author Contributions
Conceptualization, F.P.; methodology, F.P.; software, J.V.; validation, F.P., E.V., R.S., and B.C.; formal analysis, F.P.; investigation, F.P.; resources, E.V., R.S., and B.C.; data curation, J.V.; writing—original draft preparation, F.P.; writing—review and editing, E.V., R.S., and B.C.; visualization, J.V.; supervision, E.V., R.S., and B.C.; project administration, F.P.; funding acquisition, not applicable. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
The data supporting the reported results are not publicly available due to privacy restrictions. However, the authors are willing to provide further details upon reasonable request via email to the corresponding author.
Acknowledgments
The authors would like to express their gratitude to the Pontificia Universidad Católica de Valparaíso for providing the institutional and technical support necessary for the development of this work. Special thanks are extended to Fabián Pizarro for conceiving the main research idea and leading the writing of the manuscript, and to José Villamayor for implementing the experiments and preparing the corresponding tables. The remaining co-authors, as faculty members, contributed through academic guidance, discussion of results, and critical review of the manuscript.
Conflicts of Interest
The authors declare no conflicts of interest.
References
- Vega, E.; Lemus-Romani, J.; Soto, R.; Crawford, B.; Löffler, C.; Peña, J.; Talbi, E.G. Autonomous Parameter Balance in Population-Based Approaches: A Self-Adaptive Learning-Based Strategy. Biomimetics 2024, 9, 82. [Google Scholar] [CrossRef] [PubMed]
- Hussain, W.; Mushtaq, M.F.; Shahroz, M.; Akram, U.; Ghith, E.S.; Tlija, M.; Kim, T.H.; Ashraf, I. Ensemble genetic and CNN model-based image classification by enhancing hyperparameter tuning. Sci. Rep. 2025, 15, 1003. [Google Scholar] [CrossRef]
- Maqsood, S.; Xu, S.; Tran, S.; Garg, S.; Springer, M.; Karunanithi, M.; Mohawesh, R. A survey: From shallow to deep machine learning approaches for blood pressure estimation using biosensors. Expert Syst. Appl. 2022, 197, 116788. [Google Scholar] [CrossRef]
- Vega, E.; Soto, R.; Crawford, B.; Peña, J.; Castro, C. A learning-based hybrid framework for dynamic balancing of exploration-exploitation: Combining regression analysis and metaheuristics. Mathematics 2021, 9, 1976. [Google Scholar] [CrossRef]
- Vega, E.; Soto, R.; Crawford, B.; Peña, J.; Contreras, P.; Castro, C. Predicting population size and termination criteria in metaheuristics: A case study based on spotted hyena optimizer and crow search algorithm. Appl. Soft Comput. 2022, 128, 109513. [Google Scholar] [CrossRef]
- Meir, Y.; Tevet, O.; Tzach, Y.; Hodassman, S.; Gross, R.D.; Kanter, I. Efficient shallow learning as an alternative to deep learning. Sci. Rep. 2023, 13, 5423. [Google Scholar] [CrossRef]
- Illing, B.; Gerstner, W.; Brea, J. Biologically plausible deep learning—But how far can we go with shallow networks? Neural Netw. 2019, 118, 90–101. [Google Scholar] [CrossRef]
- Sun, Y.; Xue, B.; Zhang, M.; Yen, G.G.; Lv, J. Automatically designing CNN architectures using the genetic algorithm for image classification. IEEE Trans. Cybern. 2020, 50, 3840–3854. [Google Scholar] [CrossRef]
- Srinivas, M.; Patnaik, L.M. Adaptive probabilities of crossover and mutation in genetic algorithms. IEEE Trans. Syst. Man, Cybern. 1994, 24, 656–667. [Google Scholar] [CrossRef]
- Yang, L.; Shami, A. On hyperparameter optimization of machine learning algorithms: Theory and practice. Neurocomputing 2020, 415, 295–316. [Google Scholar] [CrossRef]
- Jlifi, B.; Ferjani, S.; Duvallet, C. A Genetic Algorithm based Three HyperParameter optimization of Deep Long Short Term Memory (GA3P-DLSTM) for Predicting Electric Vehicles energy consumption. Comput. Electr. Eng. 2025, 123, 110185. [Google Scholar] [CrossRef]
- Zebari, R.R.; Zeebaree, S.R.; Rashid, Z.N.; Shukur, H.M.; Alkhayyat, A.; Sadeeq, M.A. A Review on Automation Artificial Neural Networks Based on Evolutionary Algorithms. In Proceedings of the 2021 14th International Conference on Developments in eSystems Engineering (DeSE), Sharjah, United Arab Emirates, 7–10 December 2021; pp. 235–240. [Google Scholar]
- Dong, C.; Cai, Y.; Dai, S.; Wu, J.; Tong, G.; Wang, W.; Wu, Z.; Zhang, H.; Xia, J. An optimized optical diffractive deep neural network with OReLU function based on genetic algorithm. Opt. Laser Technol. 2023, 160, 109104. [Google Scholar] [CrossRef]
- Jiang, W. MNIST-MIX: A multi-language handwritten digit recognition dataset. IOP SciNotes 2020, 1, 025002. [Google Scholar] [CrossRef]
- Liu, D.; Zhang, W.; Duan, K.; Zuo, J.; Li, M.; Zhang, X.; Huang, X.; Liang, X. Intelligent prediction and optimization of ground settlement induced by shield tunneling construction. Tunn. Undergr. Space Technol. 2025, 160, 106486. [Google Scholar] [CrossRef]
- Fernandes, J.B.; Santos-da Silva, F.H.; Barros, T.; Assis, I.A.; Xavier-de Souza, S. PATSMA: Parameter Auto-tuning for Shared Memory Algorithms. SoftwareX 2024, 27, 101789. [Google Scholar] [CrossRef]
- Ahmednour, O.; Chen, D.; Liu, J.; Ye, Z.; Song, X. Comparative analysis of rate of penetration prediction and optimization in deep wells using real-time continuous stacked generalization ensemble learning: A case study in Shunbei. Geoenergy Sci. Eng. 2025, 247, 213674. [Google Scholar] [CrossRef]
- Meng, L.; Zhang, C.; Ren, Y.; Zhang, B.; Lv, C. Mixed-integer linear programming and constraint programming formulations for solving distributed flexible job shop scheduling problem. Comput. Ind. Eng. 2020, 142, 106347. [Google Scholar] [CrossRef]
- An, J.; Mikhaylov, A.; Jung, S.U. A linear programming approach for robust network revenue management in the airline industry. J. Air Transp. Manag. 2021, 91, 101979. [Google Scholar] [CrossRef]
- Wong, E.Y.; Ling, K.K. A Mixed Integer Programming Approach to Air Cargo Load Planning with Multiple Aircraft Configurations and Dangerous goods. In Proceedings of the 2020 7th International Conference on Frontiers of Industrial Engineering (ICFIE), Singapore, 27–29 September 2020; pp. 123–130. [Google Scholar]
- Zhang, B.; Yao, Y.; Kan, H.; Chan, M.P.; Lam, C.T.; Im, S.K. A Hybrid Optimization Algorithm for a Multi-Objective Aircraft Loading Problem with Complex Constraints. IEEE Access 2025, 13, 47617–47631. [Google Scholar] [CrossRef]
- Wojtowytsch, S.; Weinan, E. Can shallow neural networks beat the curse of dimensionality? A mean field training perspective. IEEE Trans. Artif. Intell. 2020, 1, 121–129. [Google Scholar] [CrossRef]
- He, H.; Chen, M.; Xu, G.; Zhu, Z.; Zhu, Z. Learnability and robustness of shallow neural networks learned by a performance-driven BP and a variant of PSO for edge decision-making. Neural Comput. Appl. 2021, 33, 13809–13830. [Google Scholar] [CrossRef]
- Li, Z.; Gong, B.; Yang, T. Improved dropout for shallow and deep learning. Adv. Neural Inf. Process. Syst. 2016, 29, 2531–2539. [Google Scholar]
- Barrera-García, J.; Cisternas-Caneo, F.; Crawford, B.; Soto, R.; Becerra-Rozas, M.; Giachetti, G.; Monfroy, E. Enhancing Reptile Search Algorithm Performance for the Knapsack Problem with Integration of Chaotic Map. In Proceedings of the Mexican International Conference on Artificial Intelligence, Tonantzintla, Mexico, 21–25 October 2024; Springer: Cham, Switzerland, 2024; pp. 70–81. [Google Scholar]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).