Beyond Neural Solvers: A Critical Review of Machine Learning for Combinatorial Optimization
Abstract
1. Introduction
1.1. Motivation
1.2. Key Contributions
- It introduces a mathematical formulation that provides a formal basis for understanding learning-based approaches to combinatorial optimization.
- It develops a taxonomy of machine learning techniques for combinatorial optimization.
- It critically evaluates ML-for-CO approaches from several angles, including feasibility, heuristic selection, solution generation, and solver support, to identify methodological limitations.
- It traces the field’s shift from purely end-to-end neural solvers toward hybrid neuro-symbolic optimization, which combines learned components with symbolic optimization backbones.
- It highlights LLMs and multimodal foundation models as the latest advances in ML-for-CO.
- It identifies open research issues that are essential to the field’s advancement.
1.3. Paper Outline
2. Review Protocol
2.1. Research Concern and Questions
2.2. Search Methodology and Selection Criteria
3. Technical Background: Combinatorial Optimization and Learning-Based Solvers
4. Literature Review and Critical Synthesis of ML-Based Combinatorial Optimization
4.1. Reviews and Fundamental Perspectives
4.2. Methods of Reinforcement Learning for Combinatorial Optimization
4.3. Graph Neural Networks and Deep Learning Architectures
4.4. Quantum, Quantum-Inspired, and Ising Machine Methods
4.5. ML-Enhanced Metaheuristics and Hyper-Heuristics
4.6. Predict-Then-Optimize and End-to-End Differentiable Pipelines
4.7. Domain-Specific Applications
4.8. Theoretical Foundations and Robustness
4.9. Large Language Models for Combinatorial Optimization
4.10. Research Gaps
- Distribution-shift-robust learned solvers: When test instances deviate from the training distribution, existing learned solvers tend to perform worse. Future techniques should preserve solution quality across varying operational settings, structural characteristics, and unseen instance sizes.
- Trustworthy neural CO: Formal feasibility and optimality guarantees are absent from the majority of neural CO techniques. To trust AI-driven predictions in critical applications, black-box ML must be integrated with symbolic, rule-based, or formal verification methods that rigorously ensure safety and constraint compliance.
- Few-shot and cross-family adaptation: Existing meta-learning work, including [37], mainly targets transfer across related instance distributions. More attention is needed to adaptation across distinct CO problem families, such as routing, scheduling, packing, covering, and graph optimization.
- LLM agent-specific standards: LLM-driven heuristic synthesis needs reproducibility- and contamination-aware benchmarks that report model variants, prompts, deductive estimates, feasibility evaluations, and cost-normalized comparisons with neural, traditional, and metaheuristic baselines.
- Honest negative outcomes: In the vein of study [59], the field needs further research that explicitly investigates whether machine learning improves performance for a particular CO problem family, including open reporting of neutral or negative results.
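The first two gaps above (robustness to unseen instance sizes and verifiable feasibility) suggest an evaluation loop along the following lines. This is a minimal sketch, not a method from the reviewed literature: `nearest_neighbor` is a hypothetical stand-in for a learned constructive solver, the TSP instances are uniform random points, and exact optima come from brute-force enumeration, which is only practical for very small `n`.

```python
import itertools
import math
import random


def tour_length(points, tour):
    """Total length of a closed tour over 2D points."""
    return sum(
        math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def is_feasible_tour(tour, n):
    """A TSP tour is feasible iff it visits each of the n cities exactly once."""
    return len(tour) == n and sorted(tour) == list(range(n))


def nearest_neighbor(points):
    """Hypothetical placeholder for a learned constructive solver."""
    tour, unvisited = [0], set(range(1, len(points)))
    while unvisited:
        last = points[tour[-1]]
        nxt = min(unvisited, key=lambda j: math.dist(last, points[j]))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour


def brute_force_optimum(points):
    """Exact optimal tour length by enumeration (very small n only)."""
    return min(
        tour_length(points, (0,) + perm)
        for perm in itertools.permutations(range(1, len(points)))
    )


def gap_report(solver, sizes, instances_per_size=5, seed=0):
    """Per instance size: feasibility rate and mean optimality gap."""
    rng = random.Random(seed)
    report = {}
    for n in sizes:
        gaps, feasible = [], 0
        for _ in range(instances_per_size):
            pts = [(rng.random(), rng.random()) for _ in range(n)]
            tour = solver(pts)
            if not is_feasible_tour(tour, n):
                continue  # infeasible outputs lower the rate and are never scored
            feasible += 1
            gaps.append(tour_length(pts, tour) / brute_force_optimum(pts) - 1.0)
        report[n] = {
            "feasibility_rate": feasible / instances_per_size,
            "mean_gap": sum(gaps) / len(gaps) if gaps else None,
        }
    return report
```

Calling `gap_report(nearest_neighbor, sizes=[5, 6, 7])` returns, per instance size, the share of feasible outputs and the mean relative gap over those. A solver trained on one size could be passed in place of `nearest_neighbor` to see how both numbers move on sizes it has not seen.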
5. Discussion
5.1. Synthesis of ML-for-CO Paradigms: From Standalone Learning to Hybrid Optimization
5.2. Descriptive Trends in the Reviewed Literature
5.3. Cross-Method Comparative Analysis
5.4. Evaluation, Benchmarking, and Reproducibility Challenges
5.5. LLM-Based Optimization: Opportunities and Reliability Challenges
5.6. Cross-Cutting Limitations and Deployment Barriers
- Quantum and specialized-hardware approaches remain promising but are limited by embedding, mapping, hardware-access, and reproducibility constraints [49].
- The reliability problems of LLM-based methods (discussed in Section 5.5) highlight the general need for cost-normalized evaluation, transparent reporting, and verification.
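One way to make cost-normalized evaluation concrete is to compare methods only on feasible runs and only at matched cost. The sketch below assumes every run is logged with a minimization objective and a cost in one common unit (for example CPU-seconds or API spend). The `RunRecord` type, the method names, and all numbers are hypothetical illustrations, not results from the reviewed studies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunRecord:
    method: str
    objective: float  # lower is better (minimization)
    cost: float       # one common unit for all methods (assumption)
    feasible: bool    # outcome of an independent feasibility check


def pareto_front(records):
    """Feasible runs that no other feasible run beats on both cost and objective."""
    feasible = [r for r in records if r.feasible]
    front = []
    for r in feasible:
        dominated = any(
            o.cost <= r.cost
            and o.objective <= r.objective
            and (o.cost < r.cost or o.objective < r.objective)
            for o in feasible
        )
        if not dominated:
            front.append(r)
    return sorted(front, key=lambda r: r.cost)


def best_within_budget(records, budget):
    """Best feasible run whose cost does not exceed the budget, or None."""
    eligible = [r for r in records if r.feasible and r.cost <= budget]
    return min(eligible, key=lambda r: r.objective) if eligible else None


# Made-up numbers for illustration only.
runs = [
    RunRecord("greedy", 115.0, 0.1, True),
    RunRecord("metaheuristic", 103.0, 10.0, True),
    RunRecord("llm_heuristic", 105.0, 40.0, True),
    RunRecord("exact_solver", 100.0, 120.0, True),
    RunRecord("neural", 98.0, 2.0, False),
]
```

With these illustrative records, the infeasible `neural` run is excluded even though it reports the lowest objective, and `llm_heuristic` drops off the front because `metaheuristic` is both cheaper and better. Reporting the whole front, rather than a single best objective, keeps the cost axis visible.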
5.7. Future Research Directions
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
|---|---|
| AI | Artificial Intelligence |
| CO | Combinatorial Optimization |
| CP | Constraint Programming |
| CMA-ES | Covariance Matrix Adaptation Evolution Strategy |
| DNN | Deep Neural Network |
| DL | Deep Learning |
| DQN | Deep Q-Network |
| DRL | Deep Reinforcement Learning |
| ETO | Engineer-to-Order |
| GNN | Graph Neural Network |
| GP | Genetic Programming |
| IGD | Inverted Generational Distance |
| IP | Integer Programming |
| LLM | Large Language Model |
| LNS | Large Neighborhood Search |
| Max-Cut | Maximum Cut Problem |
| MAX-SAT | Maximum Satisfiability Problem |
| MCTS | Monte Carlo Tree Search |
| ML | Machine Learning |
| ML-for-CO | Machine Learning for Combinatorial Optimization |
| MOO | Multi-Objective Optimization |
| NAS | Neural Architecture Search |
| NP-hard | Nondeterministic Polynomial-time Hard |
| OR | Operations Research |
| QUBO | Quadratic Unconstrained Binary Optimization |
| RAN | Radio Access Network |
| ReAIM | ReRAM-based Adaptive Ising Machine |
| RL | Reinforcement Learning |
| RMSE | Root Mean Square Error |
| RNN | Recurrent Neural Network |
| SCP | Set Covering Problem |
| SL | Supervised Learning |
| SLA | Service-Level Agreement |
| SME | Small- and Medium-Sized Enterprise |
| SPO | Smart Predict-and-Optimize |
| TSP | Traveling Salesman Problem |
| VM | Virtual Machine |
| VRP | Vehicle Routing Problem |
| WoS | Web of Science |
| XAI | Explainable Artificial Intelligence |
References
- Karimi-Mamaghan, M.; Mohammadi, M.; Meyer, P.; Karimi-Mamaghan, A.M.; Talbi, E.-G. Machine learning at the service of meta-heuristics for solving combinatorial optimization problems: A state-of-the-art. Eur. J. Oper. Res. 2022, 296, 393–422.
- Wang, F.; He, Q.; Li, S. Solving Combinatorial Optimization Problems with Deep Neural Network: A Survey. Tsinghua Sci. Technol. 2024, 29, 1266–1282.
- Sanchez, M.; Cruz-Duarte, J.M.; Carlos Ortiz-Bayliss, J.; Ceballos, H.; Terashima-Marin, H.; Amaya, I. A Systematic Review of Hyper-Heuristics on Combinatorial Optimization Problems. IEEE Access 2020, 8, 128068–128095.
- Mazyavkina, N.; Sviridov, S.; Ivanov, S.; Burnaev, E. Reinforcement learning for combinatorial optimization: A survey. Comput. Oper. Res. 2021, 134, 105400.
- Bolufé-Röhler, A.; Tamayo-Vera, D. Machine Learning for Enhancing Metaheuristics in Global Optimization: A Comprehensive Review. Mathematics 2025, 13, 2909.
- Zhang, C.; Wu, Y.; Ma, Y.; Song, W.; Le, Z.; Cao, Z.; Zhang, J. A review on learning to solve combinatorial optimisation problems in manufacturing. IET Collab. Intell. Manuf. 2023, 5, e12072.
- Çetinkaya, İ.O.; Büyüktahtakın, İ.E.; Shojaee, P.; Reddy, C.K. Discovering heuristics with Large Language Models (LLMs) for mixed-integer programs: Single-machine scheduling. Comput. Oper. Res. 2026, 186, 107325.
- Albalkhi, S.Y.; Alotaibi, D.F.; Dimitriou, T.; Ahmad, I. Route Optimization Reimagined: Multi-Modal Large Language Models for Next-Generation Vehicle Routing. IEEE Access 2026, 14, 23835–23865.
- Korte, B.; Vygen, J. Combinatorial Optimization: Theory and Algorithms; Algorithms and Combinatorics; Springer: Berlin/Heidelberg, Germany, 2018; Volume 21.
- Garey, M.R.; Johnson, D.S. Computers and Intractability: A Guide to the Theory of NP-Completeness, 27th ed.; A Series of Books in the Mathematical Sciences; W. H. Freeman & Co.: New York, NY, USA, 2009.
- Cantürk, F.; Varol, T.; Aydoğan, R.; Özener, O.Ö. Scalable Primal Heuristics Using Graph Neural Networks for Combinatorial Optimization. J. Artif. Intell. Res. 2024, 80, 327–376.
- Mandi, J.; Demirovic, E.; Stuckey, P.J.; Guns, T. Smart Predict-and-Optimize for Hard Combinatorial Optimization Problems. In Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence; Association for the Advancement of Artificial Intelligence: New York, NY, USA, 2020; Volume 34, pp. 1603–1610.
- Charytitsch, B.C.B.; Nascimento, M.C.V. An efficient hybridization of Graph Representation Learning and metaheuristics for the Constrained Incremental Graph Drawing Problem. Eur. J. Oper. Res. 2026, 330, 381–397.
- Heng, S.; Kim, D.; Kim, T.; Han, Y. How to Solve Combinatorial Optimization Problems Using Real Quantum Machines: A Recent Survey. IEEE Access 2022, 10, 120106–120121.
- Vazirani, V.V. Approximation Algorithms; Springer: Berlin/Heidelberg, Germany, 2003.
- Gambella, C.; Ghaddar, B.; Naoum-Sawaya, J. Optimization problems for machine learning: A survey. Eur. J. Oper. Res. 2021, 290, 807–828.
- Di Caro, G.A.; Maniezzo, V.; Montemanni, R.; Salani, M. Machine learning and combinatorial optimization, editorial. OR Spectr. 2021, 43, 603–605.
- Yang, X.; Wang, Z.; Zhang, H.; Ma, N.; Yang, N.; Liu, H.; Zhang, H.; Yang, L. A Review: Machine Learning for Combinatorial Optimization Problems in Energy Areas. Algorithms 2022, 15, 205.
- Omidvar, M.N.; Li, X.; Yao, X. A Review of Population-Based Metaheuristics for Large-Scale Black-Box Global Optimization-Part I. IEEE Trans. Evol. Comput. 2022, 26, 802–822.
- Świechowski, M.; Godlewski, K.; Sawicki, B.; Mańdziuk, J. Monte Carlo Tree Search: A review of recent modifications and applications. Artif. Intell. Rev. 2023, 56, 2497–2562.
- Molina-Abril, G.; Calvet, L.; Juan, A.A.; Riera, D. Strategic Decision-Making in SMEs: A Review of Heuristics and Machine Learning for Multi-Objective Optimization. Computation 2025, 13, 173.
- Nayeri, Z.M.; Ghafarian, T.; Javadi, B. Application placement in Fog computing with AI approach: Taxonomy and a state of the art. J. Netw. Comput. Appl. 2021, 185, 103078.
- Boubaker, N.E.H.; Zarour, K.; Guermouche, N.; Benmerzoug, D. A Comprehensive Survey on Resource Management for IoT Applications in Edge-Fog-Cloud Environments. IEEE Access 2025, 13, 111892–111925.
- Palk, M.; Voß, S. Graph Combinatorial Optimization Problems for Blockchain Transaction Network Analysis. Mathematics 2026, 14, 345.
- Burggraef, P.; Wagner, J.; Koke, B.; Steinberg, F. Approaches for the Prediction of Lead Times in an Engineer to Order Environment-A Systematic Review. IEEE Access 2020, 8, 142434–142445.
- Chi, M.; Pang, W.; Wu, X.; Zhao, P.; Li, Y.; Wang, T.; Qian, J.; Xiao, Y.; Wang, L.; Zhou, Y. A generalized neural solver based on LLM-guided heuristic evolution framework for solving diverse variants of vehicle routing problems. Expert Syst. Appl. 2026, 296, 128876.
- Wu, X.; Wang, D.; Wu, C.; Wen, L.; Miao, C.; Xiao, Y.; Zhou, Y. Efficient Heuristics Generation for Solving Combinatorial Optimization Problems Using Large Language Models. In Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.2; ACM: Toronto, ON, Canada, 2025; pp. 3228–3239.
- Dolatshah, K.; Toroghi Haghighat, A.; Khajehvand, V.; Hosseini Shirvani, M. Sustainable virtual machine placement in heterogeneous cloud data centers: A reinforcement learning-based approach. Computing 2026, 108, 17.
- Ding, X.; Zhang, Y.; Chen, B.; Ying, D.; Zhang, T.; Chen, J.; Zhang, L.; Cerpa, A.; Du, W. Scalable and Efficient Reinforcement Learning for Virtual Machine Rescheduling in Cloud Data Centers. IEEE Trans. Parallel Distrib. Syst. 2026, 37, 1186–1204.
- Zou, R.; Qin, H.; Xiang, Y.; Wu, C. Handling integrated transportation and production scheduling via deep-Q-network-enhanced multi-objective quality–diversity algorithm. Eng. Optim. 2026, 1–39.
- Jesus, A.; Corrêa, A.; Vieira, M.; Marques, C.; Silva, C.; Moniz, S. Enhancing multi-agent deep reinforcement learning for flexible job-shop scheduling through constraint programming. Comput. Oper. Res. 2026, 190, 107428.
- Dantas, A.; do Rego, A.F.; Pozo, A. Using deep Q-network for selection hyper-heuristics. In GECCO ’21: Proceedings of the Genetic and Evolutionary Computation Conference Companion; Association for Computing Machinery: New York, NY, USA, 2021; pp. 1488–1492.
- Gu, S.; Yang, Y. A Deep Learning Algorithm for the Max-Cut Problem Based on Pointer Network Structure with Supervised Learning and Reinforcement Learning Strategies. Mathematics 2020, 8, 298.
- Zhao, F.; Gao, J.; Wang, L.; Sang, H. A Tri-Stage Cooperative Optimization Algorithm with Q-Learning Mechanism for the Multiobjective Distributed Flexible Job Shop Scheduling With Worker Factors. IEEE Trans. Syst. Man Cybern.-Syst. 2026, 56, 1911–1925.
- Xu, M.; Mei, Y.; Zhang, F.; Zhang, M. Niching Genetic Programming to Learn Actions for Deep Reinforcement Learning in Dynamic Flexible Scheduling. IEEE Trans. Evol. Comput. 2026, 30, 61–75.
- Han, S.; Zhang, H.; Li, X.; Yu, J.; Liu, Z.; Zhang, T.; Zheng, X.; Nie, W. Joint Resource Allocation for Underwater Acoustic Cooperative Communication Networks: A Hierarchical Combinatorial Bandit Approach. IEEE Trans. Cogn. Commun. Netw. 2026, 12, 6104–6118.
- Ge, F.; Wang, M.; Chen, D.; Shen, L.; Liu, H. Adaptive Geometry Based Meta-Learning for Multi-Objective Combinatorial Optimization Problems. Intell. Artif. 2026, 20, 53–66.
- Ben Said, A.; Mouhoub, M. Machine Learning and Constraint Programming for Efficient Healthcare Scheduling. Int. J. Softw. Eng. Knowl. Eng. 2026, 36, 1089–1120.
- Guo, X.; Zhang, P.; Cai, Q.; Zhang, Y. Learning to solve combinatorial optimization problems with heterophily. Neural Netw. 2025, 189, 107554.
- Garmendia, A.I.; Ceberio, J.; Mendiburu, A. Neural Improvement Heuristics for Graph Combinatorial Optimization Problems. IEEE Trans. Neural Netw. Learn. Syst. 2024, 35, 18300–18312.
- Wang, R.; Hua, Z.; Liu, G.; Zhang, J.; Yan, J.; Qi, F.; Yang, S.; Zhou, J.; Yang, X. A Bi-Level Framework for Learning to Solve Combinatorial Optimization on Graphs. In Advances in Neural Information Processing Systems 34 (NEURIPS 2021); Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P., Vaughan, J., Eds.; Curran Associates Inc.: Red Hook, NY, USA, 2021; Volume 34.
- Bu, F.; Jo, H.; Lee, S.Y.; Ahn, S.; Shin, K. Tackling prevalent conditions in unsupervised combinatorial optimization: Cardinality, minimum, covering, and more. In ICML’24: Proceedings of the 41st International Conference on Machine Learning; MLResearch Press: Norfolk, MA, USA, 2024; Volume 235, pp. 4696–4729.
- Liu, Y.; Zhou, C.; Zhang, P.; Gao, Y.; Li, Z.; Chen, H. Meta-Heuristics Graph Neural Architecture Search for Combinatorial Optimization. IEEE Trans. Emerg. Top. Comput. Intell. 2025.
- Slysz, M.; Grodzki, Ł.; Rydlichowski, P.; Siera, D.; Kurowski, K.; Waligóra, G.; Węglarz, J. Solving combinatorial optimization and machine learning problems on hybrid near-term quantum photonic computers. Future Gener. Comput. Syst. 2026, 174, 107934.
- Jiang, J.-R.; Chu, C.-W. Classifying and Benchmarking Quantum Annealing Algorithms Based on Quadratic Unconstrained Binary Optimization for Solving NP-Hard Problems. IEEE Access 2023, 11, 104165–104178.
- Zeng, Q.-G.; Cui, X.-P.; Liu, B.; Wang, Y.; Mosharev, P.; Yung, M.-H. Performance of quantum annealing inspired algorithms for combinatorial optimization problems. Commun. Phys. 2024, 7, 249.
- Shukla, A.; Erementchouk, M.; Mazumder, P. Non-binary dynamical Ising machines for combinatorial optimization. Phys. D Nonlinear Phenom. 2025, 481, 134809.
- Cen, Y.; Das, D.; Fong, X. A tree search algorithm towards solving Ising formulated combinatorial optimization problems. Sci. Rep. 2022, 12, 14755.
- Zaman, M.; Tanahashi, K.; Tanaka, S. PyQUBO: Python Library for Mapping Combinatorial Optimization Problems to QUBO Form. IEEE Trans. Comput. 2022, 71, 838–850.
- Truger, F.; Beisel, M.; Barzen, J.; Leymann, F.; Yussupov, V. Selection and Optimization of Hyperparameters in Warm-Started Quantum Optimization for the MaxCut Problem. Electronics 2022, 11, 1033.
- Hao, T.; Huang, X.; Jia, C.; Peng, C. A Quantum-Inspired Tensor Network Algorithm for Constrained Combinatorial Optimization Problems. Front. Phys. 2022, 10, 906590.
- Ahsan Khandoker, S.; Munshad Abedin, J.; Hibat-Allah, M. Supplementing recurrent neural networks with annealing to solve combinatorial optimization problems. Mach. Learn. Sci. Technol. 2023, 4, 015026.
- Chiang, H.-W.; Nien, C.-F.; Cheng, H.-Y.; Huang, K.-P. ReAIM: A ReRAM-based Adaptive Ising Machine for Solving Combinatorial Optimization Problems. In 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA); IEEE: Piscataway, NJ, USA, 2024; pp. 58–72.
- Huang, K.-P.; Nien, C.-F.; Zhang, Y.-T.; Lee, C.-K.; Wang, Y.-C. GPU-based Ising Machine for Solving Combinatorial Optimization Problems with Enhanced Parallel Tempering Techniques. In 2024 IEEE Asia Pacific Conference on Circuits and Systems (APCCAS); IEEE: Piscataway, NJ, USA, 2024; pp. 636–640.
- Garcia, J.; Lemus-Romani, J.; Altimiras, F.; Crawford, B.; Soto, R.; Becerra-Rozas, M.; Moraga, P.; Paz Becerra, A.; Pena Fritz, A.; Rubio, J.-M.; et al. A Binary Machine Learning Cuckoo Search Algorithm Improved by a Local Search Operator for the Set-Union Knapsack Problem. Mathematics 2021, 9, 2611.
- Crawford, B.; Caballero, H.; Astorga, G.; Cisternas-Caneo, F.; Becerra-Rozas, M.; Baeza, A.; Bernales, G.; Puga, P.; Giachetti, G.; Soto, R. A Novel Binary Dream Optimization Algorithm with Data-Driven Repair for the Set Covering Problem. Biomimetics 2026, 11, 197.
- Chen, C.; Wu, J.; Chen, J.; Xia, Y.; Precup, R.-E. Learning-Guided Adaptive Search Optimization for the Weighted Independent Set Problem. Rom. J. Inf. Sci. Technol. 2026, 29, 89–99.
- Hunter, K.; Thomson, S.L.; Hart, E. Variable Importance Estimation for High-Dimensional Optimisation. In Advances in Computational Intelligence Systems; Hart, E., Horvath, T., Tan, Z., Thomson, S., Eds.; Springer Nature: Cham, Switzerland, 2026; pp. 115–126.
- Dell’Amico, M.; Franchini, G.; Magnani, M.; Zanni, L. Can machine learning help in solving the pallet loading optimization problem? J. Heuristics 2026, 32, 11.
- Ɖurasević, M.; Ɖumić, M.; Gala, F.J.G. Selection of Automatically Designed Heuristics for the Container Relocation Problem. In Advances in Computational Intelligence Systems; Hart, E., Horvath, T., Tan, Z., Thomson, S., Eds.; Springer Nature: Cham, Switzerland, 2026; pp. 15–27.
- Chen, R.; Liu, D.; Jiang, N.; Gupta, R.; Kilinc, M.; Lodi, A. Learning large neighborhood search for maritime inventory routing optimization. Int. Trans. Oper. Res. 2026.
- Xie, J.; Zhan, J.; Zhu, X. From recursion to prediction: Modeling backtracking effort in TSP with machine learning. PeerJ Comput. Sci. 2026, 12, e3516.
- El Balghiti, O.; Elmachtoub, A.N.; Grigas, P.; Tewari, A. Generalization Bounds in the Predict-Then-Optimize Framework. Math. Oper. Res. 2023, 48, 2043–2065.
- Kotary, J.; Fioretto, F.; Van Hentenryck, P. Learning Hard Optimization Problems: A Data Generation Perspective. In 35th Annual Conference on Neural Information Processing Systems (NeurIPS); Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P., Vaughan, J., Eds.; Curran Associates Inc.: Red Hook, NY, USA, 2021; Volume 34.
- Vejar, B.; Aglin, G.; Mahmutogullari, A.I.; Nijssen, S.; Schaus, P.; Guns, T. An Efficient Structured Perceptron for NP-Hard Combinatorial Optimization Problems. In International Conference on the Integration of Constraint Programming, Artificial Intelligence, and Operations Research, Part II, CPAIOR 2024; Lecture Notes in Computer Science; Dilkina, B., Ed.; Springer Nature: Cham, Switzerland, 2024; Volume 14743, pp. 253–262.
- Paulus, A.; Rolinek, M.; Musil, V.; Amos, B.; Martius, G. CombOptNet: Fit the Right NP-Hard Problem by Learning Integer Programming Constraints. In Proceedings of the 38th International Conference on Machine Learning; Meila, M., Zhang, T., Eds.; MLResearch Press: Norfolk, MA, USA, 2021; Volume 139, pp. 1–11.
- Shen, Y.; Sun, Y.; Li, X.; Eberhard, A.; Ernst, A. Adaptive solution prediction for combinatorial optimization. Eur. J. Oper. Res. 2023, 309, 1392–1408.
- Li, M.; Kolouri, S.; Mohammadi, J. Learning to Solve Optimization Problems with Hard Linear Constraints. IEEE Access 2023, 11, 59995–60004.
- Prat, E.; Chatzivasileiadis, S. Learning Active Constraints to Efficiently Solve Linear Bilevel Problems: Application to the Generator Strategic Bidding Problem. IEEE Trans. Power Syst. 2023, 38, 2376–2387.
- Kumar, M.; Kolb, S.; Teso, S.; De Raedt, L. Learning MAX-SAT from Contextual Examples for Combinatorial Optimisation. Proc. AAAI Conf. Artif. Intell. 2020, 34, 4493–4500.
- Marino, R. Learning from survey propagation: A neural network for MAX-E-3-SAT. Mach. Learn. Sci. Technol. 2021, 2, 035032.
- Liu, J.; Gao, F.; Zhang, J. Gumbel-Softmax Optimization: A Simple General Framework for Combinatorial Optimization Problems on Graphs. In Complex Networks and Their Applications VIII, Volume 1; Studies in Computational Intelligence; Cherifi, H., Gaito, S., Mendes, J., Moro, E., Rocha, L., Eds.; Springer: Cham, Switzerland, 2020; Volume 881, pp. 879–890.
- Gu, S.; Hao, T.; Yao, H. A pointer network based deep learning algorithm for unconstrained binary quadratic programming problem. Neurocomputing 2020, 390, 1–11.
- Lee, S.; Sohn, S.S.; Lee, H.-S.; Kim, D.; Kang, Y. Accelerating High-Entropy Alloy Design via Machine Learning: Predicting Yield Strength from Composition. Materials 2026, 19, 196.
- Ghiara, E.; Wu, Z.; Voulhoux, M.; Mola Bertran, O.; Bertini, V.; Torres, C.; Kethamkuzhi, A.; Telles, G.T.; Fuentes, V.; Pach, E.; et al. High-Throughput Screening of REBCO Superconducting Thin Films Fabricated Via Combinatorial Inkjet Printing and TLAG Process (Adv. Mater. Technol. 9/2026). Adv. Mater. Technol. 2026, 11, e70944.
- Figalli, A.; Qasim, S.R.; Owen, P.; Serra, N. Designing particle physics experiments with artificial intelligence. Front. Phys. 2026, 14, 1765091.
- Furusawa, S.; Dogo, C.; Saito, K.; Seki, Y.; Kikuchi, S.; Tanaka, S. Comparative evaluation of black-box optimization methods for RAN function placement problem. IEICE Commun. Express 2026, 15, 21–24.
- Uddin, A.; Sakr, A.H.; Zhang, N. Multi-Agent Task Prioritization and Offloading in Vehicular Edge Computing Environments. In 2025 IEEE 102nd Vehicular Technology Conference (VTC2025-Fall); IEEE: Piscataway, NJ, USA, 2025; pp. 1–6.
- Jiang, N.; Yan, S.; Liu, H.; Peng, M. Computation Offloading for Distributed Learning in Vehicular Networks: A Service Scheduling and Resource Allocation Method. IEEE Trans. Veh. Technol. 2026, 75, 3222–3237.
- C-Sánchez, E.; Gomez, J.F. Enhancing hotel profitability: Dynamic pricing with a Sim-Learnheuristic approach. Int. J. Hosp. Manag. 2026, 133, 104472.
- Gamarnik, D.; Jagannath, A.; Wein, A.S. Low-Degree Hardness of Random Optimization Problems. In 2020 IEEE 61st Annual Symposium on Foundations of Computer Science (FOCS 2020); Annual IEEE Symposium on Foundations of Computer Science; IEEE Computer Society: Piscataway, NJ, USA, 2020; pp. 131–140.
- Goldenberg, E.; Karthik, C.S. Hardness Amplification of Optimization Problems. In 11th Innovations in Theoretical Computer Science Conference, (ITCS-2020); Leibniz International Proceedings in Informatics; Vidick, T., Ed.; Association for Computing Machinery (ACM): New York, NY, USA, 2020; Volume 151.
- Goerigk, M.; Maher, S.J. Generating hard instances for robust combinatorial optimization. Eur. J. Oper. Res. 2020, 280, 34–45.
- Liefooghe, A.; Lopez-Ibanez, M. Many-objective (Combinatorial) Optimization is Easy. In Proceedings of the 2023 Genetic and Evolutionary Computation Conference, (GECCO-2023); Paquete, L., Ed.; Association for Computing Machinery (ACM): New York, NY, USA, 2023; pp. 704–712.
- Erwig, M.; Kumar, P. Explanations for combinatorial optimization problems. J. Comput. Lang. 2024, 79, 101272.
- Timofieva, N.K. Artificial Intelligence Problems and Combinatorial Optimization. Cybern. Syst. Anal. 2023, 59, 511–518.
- Khadka, K.; Chandrasekaran, J.; Lei, Y.; Kacker, R.N.; Kuhn, D.R. A Combinatorial Approach to Hyperparameter Optimization. In Proceedings of the CAIN 2024: IEEE/ACM 3rd International Conference on AI Engineering-Software Engineering for AI; IEEE Computer Society: Piscataway, NJ, USA; Association for Computing Machinery: New York, NY, USA, 2024; pp. 140–149.
- Shao, Z.; Yang, J.; Shen, C.; Ren, S. Learning for Robust Combinatorial Optimization: Algorithm and Application. In IEEE Conference on Computer Communications (IEEE INFOCOM-2022); IEEE: Piscataway, NJ, USA, 2022; pp. 930–939.
- Xu, J.; Yu, L.; Yang, H.; Ji, S.; Wu, P.; Zhang, Y.; Yang, A.; Li, Q.; Li, H.; Zhu, E.; et al. A special machine for solving NP-complete problems. Fundam. Res. 2025, 5, 1743–1749.
- Jena, S.K.; Subramani, K.; Velasquez, A. A Differential Approach for Several NP-hard Optimization Problems. In Artificial Intelligence and Image Analysis, ISAIM 2024, IWCIA 2024; Lecture Notes in Computer Science; Barneva, R., Brimkov, V., Gentile, C., Pacchiano, A., Eds.; Springer Nature: Cham, Switzerland, 2024; Volume 14494, pp. 68–80.

| Taxonomy Family | Definition Criterion | Boundary Explanation |
|---|---|---|
| Supervised neural CO solvers | Direct solution imitation/construction | Categorized here when learning from labeled or near-optimal instances. |
| RL-for-CO | Reward-driven sequential decision-making | Categorized here when policy learning is essential. |
| GNN-based optimization | Graph/relational representation | GNNs may arise within RL or hybrid systems but are categorized here if representation is the primary contribution. |
| Learning-enhanced metaheuristics | ML enhances a current metaheuristic | The metaheuristic remains the primary solver. |
| Predict-then-optimize | Learning optimized for downstream decision quality | Categorized here when decision loss or regret is used to assess predictions. |
| Quantum/Ising methods | QUBO/Ising formulation or specialized hardware | The main contribution is the formulation and computing basis. |
| LLM-assisted optimization | Heuristic/code/model generation | LLMs assist optimization rather than provide certificates. |
| Hybrid neuro-symbolic optimization | Integration pattern | Not an independent architecture; it integrates learned components with symbolic solvers. |

| Ref. | Review Scope | CO Problems/Domains | ML/Optimization Methods Covered | Main Contribution | Critical Limitations |
|---|---|---|---|---|---|
| [1] | ML support for metaheuristics. | General CO, including TSP, VRP, scheduling, and packing. | ML-assisted operator selection; parameter tuning; fitness surrogates; configuration. | Defines major ML usage modes inside metaheuristic search. | Predates LLM/diffusion; limited robustness and deployment evidence. |
| [2] | DNN methods for CO. | TSP, VRP, JSP, KP, and Max-Cut. | Pointer networks; transformers; GNNs; supervised/RL/unsupervised learning. | Provides an architecture vs. learning-paradigm taxonomy. | Primarily descriptive; non-unified benchmarks. |
| [3] | Systematic review of hyper-heuristics for CO. | Bin packing, scheduling, TSP, and VRP. | Selection/generation hyper-heuristics with RL, classifiers, regressors. | PRISMA-style synthesis of hyper-heuristic CO. | Pre-transformer/GNN era; limited scalability analysis. |
| [4] | Reinforcement learning for CO. | TSP, VRP, KP, JSP, and MIS. | Constructive, improvement, and hybrid RL with pointer/GNN policies. | Canonical RL-for-CO taxonomy; partial unified comparison. | Limited recent transformer/GNN-heavy coverage. |
| [5] | ML-enhanced metaheuristics in global optimization. | Continuous and combinatorial global optimization. | Operator selection; surrogates; hyperparameter learning. | Recent ML+MH integration roadmap. | Continuous-optimization bias; CO implications sometimes indirect. |
| [6] | Learning to solve CO problems in manufacturing. | Scheduling, lot sizing, factory routing, JSP/FJSP. | Neural and RL manufacturing solvers. | Connects learning-based CO to industrial manufacturing needs. | Manufacturing-specific; limited LLM/diffusion/hybrid coverage. |
| [14] | CO on real quantum machines. | Max-Cut and Ising-formulated CO problems. | QAOA; quantum annealing on D-Wave/IBM-Q. | Practical quantum-hardware overview for CO. | Hardware-limited; narrow problem coverage. |
| [16] | Optimization problems arising inside ML. | SVMs, clustering, NN verification, compression. | MIP/QP; decomposition; convex relaxations. | Catalogues optimization-for-ML perspective. | Limited learning-to-optimize coverage. |
| [17] | ML-for-CO agenda. | General CO problems. | Position/editorial perspective. | Frames OR research agenda at ML-CO interface. | Brief; no empirical comparison or technical taxonomy. |
| [18] | ML-for-CO in energy applications. | Unit commitment, dispatch, OPF, EV scheduling. | Supervised learning; RL; GNNs. | Maps ML methods to power-system CO tasks. | Single-domain; limited cross-method comparison. |
| [19] | Large-scale black-box optimization. | Large-scale continuous and some discrete benchmarks. | Decomposition; cooperative coevolution; EAs. | LSGO basis for surrogate-assisted search. | Continuous-domain focus; CO link is indirect. |
| [20] | MCTS review. | Game tree search, planning, partial CO. | MCTS variants; neural-guided search. | Summarizes neural/search integrations. | Game/planning bias; limited CO-specific depth. |
| [21] | Strategic decision-making in SMEs with heuristics/ML. | Production, logistics, marketing mix. | MOO heuristics; ML decision support. | Practice-oriented MOO+ML evidence. | Industry-specific; limited transferability. |
| [22] | Fog application placement. | Placement, assignment, scheduling. | ML; RL; metaheuristics. | Taxonomy of AI-based fog placement. | Fog-specific; CO formulation often sketched. |
| [23] | IoT edge–fog–cloud management. | Assignment, scheduling, offloading, and replication. | ML; RL; metaheuristics; classical heuristics. | Cross-tier resource-management taxonomy. | Architecture-centric; limited ML-for-CO comparison. |
| [24] | Blockchain graph CO. | Anomaly detection, clustering, matching. | Graph CO with ML and heuristic solvers. | Links blockchain analytics to graph CO. | Blockchain-specific; transfer unclear. |
| [25] | ETO lead-time prediction. | Industrial scheduling proxy. | ML regression; scheduling+ML hybrids. | Method taxonomy for ETO lead-time prediction. | Prediction-oriented; CO role is downstream. |

| Ref. | CO Problem/Domain | RL Formulation | Model/Solver Integration | Evaluation Metrics/Baselines | Critical Limitations |
|---|---|---|---|---|---|
| [28] | Sustainable VM placement | Sustainability-aware placement policy | RL policy for power/SLA/utilization balance | Power, SLA, utilization vs heuristics | Environment-specific objective; weak DRL/CP baselines |
| [29] | VM rescheduling | Dynamic resource-allocation policy | Scalable deep RL rescheduling architecture | SLA, energy/cost, makespan vs heuristics | Production-specific; reward transfer uncertain |
| [30] | Transportation-production scheduling | DQN-assisted multi-objective search | DQN inside QD/MOEA loop | Hypervolume, IGD vs MOEAs | Many hyperparameters; unclear many-objective scalability |
| [31] | Flexible job-shop scheduling | Multi-agent DRL with feasibility filtering | CP-filtered DRL actions | Makespan, feasibility rate | CP filtering cost; FJSP-specific evaluation |
| [32] | Selection hyper-heuristics | Online heuristic-selection policy | DQN selects low-level heuristics | Best objective on benchmarks | Sample inefficient; weak regret/optimality guarantees |
| [33] | Max-Cut | Supervised + policy-gradient RL | LSTM pointer network decoder | Cut value, optimality gap | Sequential decoding limits scale; pre-transformer design |
| [34] | Multi-objective distributed FJSP | Q-learning search control | Q-learning in tri-stage cooperative optimizer | Hypervolume, IGD, makespan | Worker factor specificity; limited transfer |
| [35] | Dynamic flexible scheduling | DRL composes symbolic GP actions | GP + DRL hybrid | Mean tardiness, makespan | Hand-designed action space; interpretability unverified |
| [36] | Underwater acoustic networks | Hierarchical combinatorial bandit | Bandit/RL for joint resource allocation | Throughput, energy efficiency | Stationarity/channel assumptions may fail |

| Ref. | CO Problem/Domain | Graph Representation | Learning Architecture | Evaluation Metrics/Findings | Critical Limitations |
|---|---|---|---|---|---|
| [11] | MIP primal heuristics | Variable-constraint bipartite graph | GNN predicts primal assignments | Primal gap/integral; faster selected MIPLIB closure | Uneven MIP family transfer; weak search integration |
| [13] | Constrained incremental graph drawing | Problem graph features | Graph representation + metaheuristics | Crossing number; runtime | Problem-specific; limited graph-CO transfer |
| [39] | Heterophilic CO: MIS, Max-Cut | Heterophilic graph signals | Heterophily-aware GNN | Approximation ratio | Promising but narrow CO-family evidence |
| [41] | Graph CO | Graph instance representation | Bi-level GNN-guided solver | Solution quality on graph benchmarks | Costly bi-level training; no convergence guarantees |
| [40] | Graph CO: TSP/VRP | Route/problem graph | GNN neural improvement heuristic | Fixed-budget solution quality | Initial solution dependence; weaker on constrained cases |
| [42] | Cardinality/covering constraints | Constraint graph/relaxation | Unsupervised GNN surrogate | Approximation ratio; constraint violations | Problem-specific relaxations; discrete-continuous gap |
| [43] | CO solver architecture search | Encoded GNN architectures | Metaheuristic-guided NAS | Optimality gap; wall-clock search | High NAS overhead; uncertain cost-benefit |
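
The variable-constraint bipartite graph listed for [11] can be illustrated with a small sketch. This is not the encoding used in [11]: the function name `mip_to_bipartite`, the node feature names, and the toy instance are hypothetical. The sketch assumes a problem of the form min cᵀx subject to Ax ≤ b and connects constraint i to variable j whenever the coefficient A[i][j] is nonzero.

```python
from typing import Dict, List


def mip_to_bipartite(A: List[List[float]], b: List[float], c: List[float]) -> Dict[str, list]:
    """Encode min c^T x s.t. A x <= b as a variable-constraint bipartite graph.

    Hypothetical feature layout: variable nodes carry the objective
    coefficient, constraint nodes carry the right-hand side, and each
    edge carries the matrix coefficient.
    """
    if len(A) != len(b) or any(len(row) != len(c) for row in A):
        raise ValueError("shape mismatch between A, b, and c")
    var_nodes = [{"obj_coef": cj} for cj in c]
    con_nodes = [{"rhs": bi} for bi in b]
    # One edge (constraint index, variable index, coefficient) per nonzero entry.
    edges = [
        (i, j, a)
        for i, row in enumerate(A)
        for j, a in enumerate(row)
        if a != 0
    ]
    return {"var_nodes": var_nodes, "con_nodes": con_nodes, "edges": edges}


# Toy instance (assumed for illustration): 2 constraints, 3 variables.
A = [[1.0, 2.0, 0.0],
     [0.0, 1.0, 3.0]]
graph = mip_to_bipartite(A, b=[4.0, 5.0], c=[1.0, 1.0, 2.0])
print(graph["edges"])
```

A GNN would then pass messages along these edges in both directions; the sparsity of A, rather than the number of variables alone, determines the graph size.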

| Ref. | CO Formulation | Computing Paradigm | Solver/Platform | Evaluation Metrics/Findings | Critical Limitations |
|---|---|---|---|---|---|
| [44] | CO/ML mapped to quantum routines | Hybrid quantum-classical | Photonic quantum hardware | Time-to-solution; photon count | Small instances; narrow classical comparison |
| [45] | QUBO NP-hard formulations | Quantum annealing | D-Wave annealer | Time-to-solution; success probability | Minor-embedding overhead; limited classical baselines |
| [46] | Generic QUBO/Ising instances | Quantum-inspired classical optimization | Simulated bifurcation/quantum annealing | Solution quality vs quantum annealing | No demonstrated quantum advantage |
| [47] | Max-Cut/Ising CO | Dynamical Ising machine | Continuous non-binary spin solver | Cut value; runtime | Basin escape theory incomplete |
| [49] | Constrained CO to QUBO | QUBO software tooling | PyQUBO + downstream solvers | Usability; mapping performance | Not a learning method; penalty/solver dependent |
| [50] | Warm-started Max-Cut QAOA | Variational quantum optimization | QAOA hyperparameter optimization | Approximation ratio at low depth | Max-Cut only; hardware/simulation size limits |
| [52] | Ising/CO instances | Neural variational + annealing | RNN ansatz + simulated annealing | Approximation ratio | Limited strong annealing/GNN baselines |
| [53] | Ising/QUBO solving | Specialized analog hardware | ReRAM adaptive Ising machine | Energy/time per solution | Platform-specific; low reproducibility |
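
Several rows above rely on mapping a CO problem to QUBO form before any annealer or Ising machine is involved. As a minimal sketch of that mapping for Max-Cut, the code below uses the standard formulation in which each edge (i, j) contributes -(x_i + x_j - 2 x_i x_j) to the energy, so the minimum QUBO energy equals the negative of the maximum cut. The function names and the 4-cycle instance are assumptions for illustration, and the brute-force search stands in for the hardware solvers in the table; it is only practical for very small graphs.

```python
from itertools import product
from typing import Dict, List, Sequence, Tuple

Qubo = Dict[Tuple[int, int], float]


def maxcut_qubo(edges: List[Tuple[int, int]]) -> Qubo:
    """Build an upper-triangular QUBO whose energy is minus the cut size."""
    Q: Qubo = {}
    for u, v in edges:
        i, j = min(u, v), max(u, v)
        # Since x_i is binary, x_i^2 == x_i, so linear terms sit on the diagonal.
        Q[(i, i)] = Q.get((i, i), 0.0) - 1.0
        Q[(j, j)] = Q.get((j, j), 0.0) - 1.0
        Q[(i, j)] = Q.get((i, j), 0.0) + 2.0
    return Q


def qubo_energy(Q: Qubo, x: Sequence[int]) -> float:
    """Evaluate x^T Q x for a binary assignment x."""
    return sum(w * x[i] * x[j] for (i, j), w in Q.items())


def brute_force_min(n: int, Q: Qubo) -> Tuple[Tuple[int, ...], float]:
    """Enumerate all 2^n assignments (illustrative stand-in for a real solver)."""
    best = min(product((0, 1), repeat=n), key=lambda x: qubo_energy(Q, x))
    return best, qubo_energy(Q, best)


# Toy instance (assumed): a 4-cycle, whose maximum cut contains all 4 edges.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
Q = maxcut_qubo(edges)
assignment, energy = brute_force_min(4, Q)
cut_size = sum(1 for u, v in edges if assignment[u] != assignment[v])
print(assignment, energy, cut_size)
```

Constrained problems additionally need penalty terms in Q, which is where the penalty and solver dependence noted for [49] enters.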

| Ref. | CO Problem | Metaheuristic Backbone | Learned Component/ML Role | Evaluation Metrics/Findings | Critical Limitations |
|---|---|---|---|---|---|
| [55] | Set union KP | Cuckoo search + local search | Classifier-guided candidate selection | Best/average objective | No modern exact baselines; dataset-specific tuning |
| [56] | Set covering problem | Binary dream optimization | Data-driven repair | Best objective on OR-Library SCP | Weak theoretical novelty; close baseline gains |
| [57] | Weighted independent set | Adaptive local search | Learning-guided search | Solution quality; runtime | WIS-limited; no broader graph-CO transfer |
| [58] | High-dimensional black box optimization | Surrogate-assisted search | Variable-importance estimation | Convergence speed; active variables | Surrogate-dependent; costly for black box settings |
| [59] | Pallet loading | Classical loading heuristics | ML vs classical heuristic comparison | Filling ratio; runtime | Useful mixed result; limited generality |
| [60] | Container relocation | Automatically designed GP heuristics | Selection among evolved heuristics | Number of relocations | Synthetic training; OOD performance unclear |
| [61] | Maritime inventory routing | Large neighborhood search | Learned neighborhood selection | Cost gap vs MIP; runtime | Private data; limited independent validation |

| Ref. | CO Problem | Prediction/Learning Target | Optimization Oracle/Strategy | Decision Metric/Findings | Critical Limitations |
|---|---|---|---|---|---|
| [12] | Hard CO: matching, KP | Unknown downstream parameters | SPO+ surrogate + optimization oracle | Regret vs two-stage learning | Loose surrogate risk; oracle call cost |
| [63] | Predict-then-optimize | Decision-focused generalization | Statistical learning/Rademacher analysis | Sample complexity bounds | Bounds may be loose; solver assumptions idealized |
| [64] | Learning hard optimization | Training instance generation | Active sampling | Solution quality vs random sampling | Expensive and learner-dependent sampling |
| [66] | Learning IP constraints | Constraint recovery | Black box differentiation through ILP | Constraint recovery; decision accuracy | Poor scaling with constraints; sensitive differentiation |
| [67] | MIP solution prediction | Warm-start solutions | Adaptive predictor + exact solver | Time-to-optimality; primal gap | Instance similarity dependence; weak shift analysis |
| [68] | Hard linear constraints | Feasible decision prediction | Differentiable projection | Violation rate; decision quality | Linear only; no integrality/nonlinear feasibility |
| [70] | Contextual MAX-SAT | Hidden formula/constraint learning | Constraint learning from examples | Recovered constraint F1; MAX-SAT score | High sample complexity; feature space dependence |
| [72] | Graph CO: Max-Cut, MIS | Continuous-discrete decision relaxation | Gumbel–Softmax relaxation | Solution quality | Relaxation gap; temperature sensitivity |
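
The regret metric that appears in this table measures decision quality rather than prediction accuracy. The sketch below illustrates the distinction on a toy selection problem; it is not the SPO+ procedure of [12]. The top-k oracle, the function names, and all numeric values are assumptions made up for illustration. Regret is taken as the true objective of the best decision minus the true objective of the decision made from predicted values.

```python
from typing import List, Sequence


def top_k_oracle(values: Sequence[float], k: int) -> List[int]:
    """Toy optimization oracle: pick the k items with the largest values."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    return sorted(order[:k])


def decision_regret(true_values: Sequence[float], predicted: Sequence[float], k: int) -> float:
    """True value lost by optimizing against predictions instead of the truth."""
    best = top_k_oracle(true_values, k)
    chosen = top_k_oracle(predicted, k)
    return sum(true_values[i] for i in best) - sum(true_values[i] for i in chosen)


def mse(true_values: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean squared prediction error, for comparison with regret."""
    return sum((t - p) ** 2 for t, p in zip(true_values, predicted)) / len(true_values)


# Illustrative numbers only (not taken from any cited study).
true_values = [10.0, 8.0, 7.5, 1.0]
pred_a = [13.0, 11.0, 5.0, 1.0]  # large errors, but the ranking of the top 2 is preserved
pred_b = [10.0, 7.6, 7.8, 1.0]   # small errors, but items 1 and 2 swap order

regret_a = decision_regret(true_values, pred_a, k=2)
regret_b = decision_regret(true_values, pred_b, k=2)
print(mse(true_values, pred_a), regret_a)
print(mse(true_values, pred_b), regret_b)
```

In this toy case the predictor with the lower squared error incurs the higher regret, which is the motivation for training against decision loss instead of a two-stage pipeline. Each regret evaluation calls the oracle, which is the oracle call cost listed under limitations.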

| Ref. | Application Domain | CO Task | ML/Optimization Method | Metrics and Main Contribution | Transferability Limitations |
|---|---|---|---|---|---|
| [38] | Healthcare | Institutional scheduling | ML predictor + CP | Feasibility, makespan, fairness | Hospital-specific rules limit transfer |
| [74] | High-entropy alloys | Composition-property search | Composition regressor | R², RMSE for yield strength | Limited by training composition coverage |
| [75] | Materials/superconducting films | Combinatorial materials screening | Inkjet printing + ML analysis | Material yield; screening efficiency | Small libraries; chemistry-specific generalization |
| [76] | Particle physics | Experiment design | Bayesian/RL design | Information gain; sensitivity | Highly domain-specific |
| [77] | Telecommunications/RAN | Function placement | BO, CMA-ES, RL comparison | Latency; deployment cost | Small, nonstandard benchmarks |
| [78] | Vehicular edge computing | Task prioritization/offloading | Counterfactual multi-agent DRL | Latency, energy, fairness; ~6% lower latency | Scalability, mobility, reward, delay assumptions |
| [79] | Vehicular networks | Offloading/resource allocation | Learning-based joint scheduling | Delay, energy, drop ratio | Simplified mobility/channel models; limited reproducibility |
| [80] | Hospitality/revenue management | Hotel dynamic pricing | Simulation + ML + metaheuristics | Revenue/RevPAR uplift | Demand assumptions constrain generality |

| Ref. | Theoretical Focus | CO Setting | Methodological Lens | Formal/Empirical Output and Relevance | Critical Limitations |
|---|---|---|---|---|---|
| [81] | Low-degree hardness | Random optimization | Statistical physics; complexity | Lower bounds; limits efficient learning | Random/worst-case setting may not match industry |
| [82] | Hardness amplification | Optimization reductions | Complexity-theoretic analysis | Hardness magnification factors | Mostly theoretical; limited design guidance |
| [83] | Hard-instance generation | Robust CO | Adversarial instance generation | Runtime and robustness-gap stress tests | Problem-specific generator; not a learning method |
| [84] | Many-objective CO difficulty | Many-objective CO | Metric/decomposition analysis | Hypervolume/decomposition evidence | Metric-dependent; provocative, not universal |
| [85] | CO explanations | Explaining CO solutions | Logical explanation framework | Explanation size; faithfulness; XAI link | Limited empirical integration with neural CO |
| [88] | Learning for robust CO | Optimization under uncertainty | Deep learning + robust optimization | Worst-case and average regret | Uncertainty-set dependence; narrow coverage |

| Ref. | CO Task | LLM Role/Modality | Output/Verification Mechanism | Metrics/Baselines | Critical Limitations |
|---|---|---|---|---|---|
| [7] | MIP single-machine scheduling | Text-based heuristic/code generation | Generated heuristics checked by MIP solvers | Optimality gap; speedup | Closed-model dependence; contamination; reproducibility |
| [8] | Vehicle routing | Multimodal map/text reasoning | Routes verified by classical VRP solvers | Tour cost vs VRP baselines | High inference cost; no certificates; small instances |
| [27] | General CO heuristic generation | Core abstraction prompting; fitness prediction | Generated code/fitness checked against HG baselines | Multi-task performance; reduced evaluation cost | Prompt/model sensitivity; weak unseen transfer |
| [26] | Diverse VRP variants | LLM-guided heuristic evolution | Evolved solver components tested on VRP variants | Routing quality; cross-variant performance | Benchmark-dependent generalization; evolution cost |

| Methodology | Primary Role of ML | Main Methodological Advantage | Main Limitation | Practical Deployment Status |
|---|---|---|---|---|
| Reinforcement learning (RL) | Learn sequential optimization policies | Does not require optimal training labels | Sample inefficiency and reward sensitivity | Practically viable in specific settings |
| Graph neural networks (GNNs) | Learn graph-structured representations | Permutation equivariance and structural generalization | Feasibility projection and over-smoothing | Under active methodological development |
| ML-enhanced metaheuristics | Improve classical optimization search | Modular, interpretable, and deployment-friendly | Often problem-specific and weakly transferable | Methodologically mature and deployable |
| Predict-then-optimize | Align learning with downstream decision quality | Strong theoretical grounding | Expensive optimization oracle calls | Practically viable in specific settings |
| Large language models (LLMs) | Generate heuristics, code, and reasoning strategies | Flexible zero-/few-shot adaptation | Verification, reproducibility, and inference cost | Primarily exploratory in practice |
| Quantum/Ising methods | Solve CO through QUBO/Ising formulations | Unified formulation and hardware acceleration potential | Mapping overhead and limited reproducibility | Restricted to specialized experimental settings |

| Method Family | Benchmark Types | Problem Scale | Statistical Validation Practice | Reproducibility Characteristics |
|---|---|---|---|---|
| RL | Routing, Scheduling | Small–large synthetic instances | Limited significance testing | Seed- and training-sensitive |
| GNNs | Graph, Routing | Small–medium training; larger inference | Ablation, cross-instance evaluation | Architecture- and data-dependent |
| ML-Enhanced Metaheuristics | Routing, Industrial | Medium–large instances | Repeated runs, objective statistics | Generally high |
| Predict-then-Optimize | Resource, Energy | Problem-dependent | Regret and decision quality metrics | Oracle-dependent |
| LLM-Assisted Optimization | Routing, Heuristics | Small–medium experimental studies | Limited statistical validation | Prompt- and model-sensitive |
| Quantum/Ising Methods | QUBO, Max-Cut | Mostly small–medium benchmark instances | Limited cross-platform consistency | Hardware-dependent |

| Evaluation Aspect | Typical Practice | Main Concern |
|---|---|---|
| Benchmarking | Synthetic-dominated evaluation | Weak real-world transferability |
| Metrics | Paradigm-specific performance measures | Limited comparability |
| Generalization | Restricted robustness testing | Uncertain out-of-distribution performance |
| Statistical Validation | Sparse significance analysis | Reliability concerns |
| Reproducibility | Partial experimental disclosure | Replication barriers |
| Reporting Practices | Inconsistent efficiency reporting | Difficult practical assessment |
| LLM/Quantum Evaluation | Prompt- or hardware-dependent evaluation | Validation and accessibility challenges |
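
The sparse significance analysis noted above can be addressed with simple paired tests on per-instance results. As one possible sketch, the code below runs an exact paired sign-flip permutation test on the optimality gaps of two solvers over the same instances. The function name and the gap values are hypothetical and chosen only for illustration; the exact enumeration assumes a small number of instances, since it visits all 2^n sign patterns.

```python
from itertools import product
from typing import Sequence


def paired_sign_flip_pvalue(gaps_a: Sequence[float], gaps_b: Sequence[float]) -> float:
    """Two-sided exact sign-flip test on paired per-instance differences.

    Under the null hypothesis that both solvers perform alike, each paired
    difference is equally likely to carry either sign.
    """
    if len(gaps_a) != len(gaps_b):
        raise ValueError("both solvers must be evaluated on the same instances")
    diffs = [a - b for a, b in zip(gaps_a, gaps_b)]
    observed = abs(sum(diffs))
    extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = abs(sum(s * d for s, d in zip(signs, diffs)))
        # Small tolerance guards against floating-point ties.
        if flipped >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Made-up optimality gaps (%) on five shared instances, for illustration only.
gaps_solver_a = [5.0, 6.0, 7.0, 8.0, 9.0]
gaps_solver_b = [4.0, 4.0, 4.0, 4.0, 4.0]
p_value = paired_sign_flip_pvalue(gaps_solver_a, gaps_solver_b)
print(p_value)
```

With only five instances the smallest attainable two-sided p-value is 2/32, which shows why small benchmark sets limit the statistical conclusions that can be drawn.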

| LLM Role | Function in CO | Verification Mechanism | Main Risk |
|---|---|---|---|
| Heuristic generation | Produces constructive or improvement rules. | Benchmark evaluation/solver comparison. | Prompt sensitivity |
| Program synthesis | Generates solver code or heuristic routines. | Unit tests/feasibility checks. | Code errors |
| Tool-augmented agent | Calls solvers, evaluates feedback, revises methods. | External solver validation. | Non-reproducible tool chains |
| Multimodal reasoning | Uses maps, layouts, or spatial context. | Classical VRP/scheduling baselines. | Small-instance evidence |
| Solver configuration | Suggests parameters or search strategies. | Runtime and solution quality tests. | Weak generalization |
| Explanation support | Describes decisions or constraints. | Human/solver consistency checks. | Hallucinated rationale |
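
The feasibility checks listed as a verification mechanism can be as simple as validating a candidate solution before its cost is trusted. The sketch below does this for a single closed tour, such as one returned by LLM-generated heuristic code. It is an assumed, minimal harness rather than the verification pipeline of any cited study; the function names, the Euclidean cost model, and the unit-square instance are illustrative.

```python
import math
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float]


def is_valid_tour(tour: Sequence[int], n: int) -> bool:
    """A tour is feasible if it visits each of the n nodes exactly once."""
    return len(tour) == n and sorted(tour) == list(range(n))


def tour_length(tour: Sequence[int], coords: Sequence[Point]) -> float:
    """Euclidean length of the closed tour, returning to the start node."""
    return sum(
        math.dist(coords[tour[i]], coords[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def verify_candidate(tour: Sequence[int], coords: Sequence[Point]) -> Tuple[bool, Optional[float]]:
    """Return (feasible, cost); cost is None when the tour is rejected."""
    if not is_valid_tour(tour, len(coords)):
        return False, None
    return True, tour_length(tour, coords)


# Toy instance (assumed): the four corners of a unit square.
coords: List[Point] = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 0.0)]
print(verify_candidate([0, 1, 2, 3], coords))  # feasible perimeter tour
print(verify_candidate([0, 1, 1, 3], coords))  # rejected: node 1 repeated, node 2 missing
```

A check of this kind establishes feasibility only. It gives no optimality certificate, so comparison against a classical solver or a known bound is still needed.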

| Methodology | Scalability | Robustness | Feasibility | Computational Overhead |
|---|---|---|---|---|
| Reinforcement Learning (RL) | Scales moderately under controlled instance distributions | Sensitive to distribution shift and reward design instability | Limited without hybrid constraint integration | High training and exploration costs |
| Graph Neural Networks (GNNs) | Structurally scalable for sparse and moderately sized graphs | Moderate structural transfer across related graph instances | Dependent on projection, repair, or decoding mechanisms | Moderate-to-high representation and inference cost |
| ML-enhanced Metaheuristics | Highly scalable through preservation of classical optimization backbones | Moderately robust within problem-specific search spaces | Strong due to explicit constraint-aware search procedures | Moderate computational overhead relative to end-to-end neural solvers |
| Predict-then-Optimize | Constrained by repeated optimization oracle evaluations | Limited evidence under significant distribution shift | Strong through optimization-aware learning formulations | High oracle and differentiation overhead |
| Large Language Models (LLMs) | Limited by inference latency and large-model computational requirements | Uncertain due to prompt sensitivity and benchmark contamination risk | Lacks formal feasibility and optimality guarantees | Very high inference and verification cost |
| Quantum/Ising Methods | Restricted by current hardware scalability and embedding constraints | Insufficiently validated across diverse CO settings | Problem-dependent and formulation-sensitive | Specialized hardware and QUBO mapping overhead |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Ibrahim, M.E.A.; Ahmed, A.E.S.; Daadaa, Y. Beyond Neural Solvers: A Critical Review of Machine Learning for Combinatorial Optimization. Mathematics 2026, 14, 2208. https://doi.org/10.3390/math14122208

