  • Article
  • Open Access

10 January 2026

A Bi-Level Intelligent Control Framework Integrating Deep Reinforcement Learning and Bayesian Optimization for Multi-Objective Adaptive Scheduling in Opto-Mechanical Automated Manufacturing

Laser Fusion Research Center, China Academy of Engineering Physics, Mianyang 621000, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2026, 16(2), 732; https://doi.org/10.3390/app16020732
(registering DOI)
This article belongs to the Special Issue Applications of Advanced Deep Learning Technology in Control and Intelligent Systems

Abstract

Opto-mechanical automated manufacturing is characterized by stringent process constraints, dynamic disturbances, and conflicting optimization objectives, which pose significant challenges for traditional scheduling and control approaches. We formulate the scheduling problem within a closed-loop control paradigm and propose a bi-level intelligent control framework that integrates Deep Reinforcement Learning (DRL) and Bayesian Optimization (BO). An inner DRL agent acts as an adaptive controller: it perceives the system state, generates control actions (scheduling decisions), and learns a near-optimal policy through a carefully designed reward function. An outer BO loop automatically tunes the DRL agent’s hyperparameters and reward weights to improve performance. Together, the two levels enable intelligent and adaptive decision-making. The proposed method is evaluated against two standard meta-heuristics, Genetic Algorithm (GA) and Particle Swarm Optimization (PSO), on a complex 20-job × 20-machine flexible job shop scheduling benchmark specific to opto-mechanical automated manufacturing. The experimental results show that the BO-DRL algorithm significantly outperforms both baselines, reducing makespan by 13.37% relative to GA and by 25.51% relative to PSO, while also achieving higher machine utilization and better on-time delivery. The algorithm also converges faster, is more robust under dynamic disruptions (e.g., machine failures, urgent orders), and scales well to larger problem instances. These results indicate that combining DRL’s perceptual decision-making capability with BO’s efficient parameter optimization yields an effective solution for intelligent scheduling in high-precision manufacturing environments.
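
The bi-level structure described in the abstract can be written compactly as a nested optimization. The formulation below is only an illustrative sketch and is not taken from the article: the symbols are all assumptions, namely η for the DRL hyperparameters, w for the reward weights, r_k for per-objective reward terms (for example makespan, utilization, and on-time delivery), F for the scheduling metric scored by the outer loop, and the weighted-sum form of the reward.

```latex
% Illustrative sketch only; all symbols are assumptions, not the article's notation.
\begin{aligned}
\text{outer (BO):}\quad
  & \theta^{*} = \arg\max_{\theta = (\eta,\, w) \,\in\, \Theta} F\bigl(\pi_{\theta}\bigr), \\
\text{inner (DRL):}\quad
  & \pi_{\theta} \approx \arg\max_{\pi}\;
    \mathbb{E}_{\pi}\Bigl[\sum_{t=0}^{T} \gamma^{t}\, r_{w}(s_t, a_t)\Bigr]
    \quad \text{(trained with hyperparameters } \eta\text{)}, \\
  & r_{w}(s_t, a_t) = \sum_{k=1}^{K} w_k\, r_k(s_t, a_t).
\end{aligned}
```

Under these assumptions, each outer BO iteration proposes a candidate θ, the inner agent is trained with that configuration, and the resulting policy's score F is returned to the BO loop as one expensive objective evaluation. The discount factor γ could equally be treated as one component of η.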
