Towards Fault-Tolerant AGV Task Scheduling in Flexible Manufacturing Systems Using a Tree-Based Max-Plus Predictive Approach
Abstract
1. Introduction
- A switching max-plus fault-tolerant predictive tree search algorithm for AGV task assignment with a nonlinear quadratic cost function.
- An IoT integration architecture connecting physical KIS.BOX dispatch devices to a 3D simulation via event-driven WebSocket communication and a lightweight REST API.
- A simulation study in Python/Blender 3D demonstrating reduced variance in task completion times compared to FIFO scheduling, validated under both nominal and fault conditions.
2. Mathematical Model
2.1. Max-Plus Algebra
2.2. System Description
- The task has entered the system;
- The loading station has finished servicing the previous robot;
- The assigned transport robot has completed its previous task and returned to the loading station.
- Vectors: ;
- Sparse matrices: ;
- Modified identity matrices: and are standard max-plus identity matrices E with their and diagonal elements replaced by and , respectively.
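The three start conditions listed above (task arrival, loading station free, robot returned) combine through a maximum, which is what max-plus algebra expresses as addition. The sketch below illustrates the two max-plus operations and a matrix-vector product; the function names and the numeric values are illustrative assumptions, not the paper's notation.

```python
NEG_INF = float("-inf")  # max-plus zero element (often written epsilon)


def oplus(a, b):
    """Max-plus addition: a (+) b = max(a, b)."""
    return max(a, b)


def otimes(a, b):
    """Max-plus multiplication: a (x) b = a + b."""
    return a + b


def mp_matvec(A, x):
    """Max-plus matrix-vector product: result_i = max_j (A[i][j] + x[j])."""
    return [max(otimes(a_ij, x_j) for a_ij, x_j in zip(row, x)) for row in A]


def task_start(arrival, station_free, robot_back):
    """A task starts once all three conditions hold, i.e. at their maximum."""
    return oplus(oplus(arrival, station_free), robot_back)


# Hypothetical times in seconds, chosen only for illustration.
start = task_start(arrival=12.0, station_free=15.0, robot_back=9.0)
finish = otimes(start, 4.0)  # a 4 s loading operation follows the start

# The max-plus identity matrix has 0 on the diagonal and -inf elsewhere.
E = [[0.0, NEG_INF], [NEG_INF, 0.0]]
unchanged = mp_matvec(E, [3.0, 5.0])
```

Here the start time is governed by the slowest of the three conditions (the loading station), and multiplying by the max-plus identity leaves the vector unchanged.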
2.3. Model Predictive Control
2.4. Fault-Tolerant Control
2.5. Problem Statement
- Inputs (Parameters):
  - Task arrival times.
  - Nominal transport times.
  - Loading and unloading durations.
  - Current availability state of the loading station, destination stations, and robots.
  - Current estimated fault delays F.
- Outputs (Decision Variables): The optimal sequence of robot resource assignments over the prediction horizon. (Note: The destination station is dictated by the operator’s input, not the algorithm).
- Objective: To determine an assignment sequence that minimizes a nonlinear quadratic cost function over the prediction horizon. Because the cost is quadratic, large individual delays are penalized heavily, which prevents task “starvation” and keeps the fleet workload balanced.
2.6. Comparison of Computational Burden: MILP vs. Proposed Tree-Based Approach
- Mixed-Integer Linear Programming (MILP)
- Linearity constraints: MILP solvers are widely used in the context of max-plus algebra but strictly require linear performance indices (such as the classic linear criterion ).
- Burden with nonlinear criteria: When nonlinear performance criteria are considered (such as the quadratic cost function needed to penalize excessive individual delays), standard MILP formulations become cumbersome. Handling these nonlinearities requires piecewise linearization and the introduction of auxiliary variables, which significantly complicates the model.
- Proposed Approach (Tree-Based Predictive Search)
- Flexibility with nonlinear criteria: The major advantage of the tree-based exhaustive search approach is its ability to permit the direct application of arbitrary, nonlinear objective functions without requiring linearization workarounds.
- Combinatorial explosion: The main limitation of this method is the combinatorial explosion of possible states. Because one robot is chosen at every step, the number of candidate assignment sequences equals the fleet size raised to the power of the prediction horizon, so the decision space grows exponentially with the horizon.
- Feasibility limits: For long prediction horizons, the application of a direct search algorithm becomes computationally prohibitive.
- Practical efficiency: In the considered stochastic intralogistics systems, the system lacks long-term deterministic knowledge of future orders, making a restricted prediction horizon (e.g., steps) entirely reasonable. With such a restricted horizon, the exponential growth of the tree does not pose a critical technological barrier. The exhaustive search method remains highly efficient computationally, enabling real-time decision optimization without the risk of converging to local minima.
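The trade-off described above can be made concrete with a minimal sketch of an exhaustive tree search under a quadratic delay cost. It is a simplified stand-in, not the paper's switching max-plus model: the loading station is ignored, each robot has a single fixed round-trip duration, and all names and numbers are hypothetical.

```python
from itertools import product


def sequence_cost(sequence, arrivals, robot_free, durations):
    """Quadratic cost of assigning robot sequence[k] to task k."""
    free = list(robot_free)  # copy so the caller's state is untouched
    cost = 0.0
    for k, r in enumerate(sequence):
        start = max(arrivals[k], free[r])  # wait for task and robot
        finish = start + durations[r]
        free[r] = finish
        cost += (finish - arrivals[k]) ** 2  # penalize large delays
    return cost


def tree_search(arrivals, robot_free, durations):
    """Enumerate every leaf of the decision tree and keep the cheapest."""
    n_robots, horizon = len(robot_free), len(arrivals)
    best_seq, best_cost = None, float("inf")
    for seq in product(range(n_robots), repeat=horizon):
        cost = sequence_cost(seq, arrivals, robot_free, durations)
        if cost < best_cost:
            best_seq, best_cost = seq, cost
    return best_seq, best_cost


# Hypothetical data: 3 tasks in the horizon, 2 robots.
arrivals = [0.0, 1.0, 2.0]
robot_free = [0.0, 0.0]
durations = [10.0, 12.0]

best_seq, best_cost = tree_search(arrivals, robot_free, durations)
n_leaves = len(robot_free) ** len(arrivals)  # fleet size ** horizon
```

In a receding-horizon loop only the first element of `best_seq` would be executed before the search is repeated. With 2 robots and a horizon of 3 the tree has 8 leaves; the same enumeration with 3 robots and a horizon of 10 would already visit 59,049 sequences.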
3. System Architecture
- Physical KIS.BOX devices (Input generators): These act as a stochastic event generator within the system. A button press by an operator determines the physical arrival time of a new task. This action directly dictates the construction of the input vector and the assignment of the destination station to the task. The absence of a predefined schedule tests the algorithm under conditions of complete uncertainty (on-demand operation).
- Blender 3D virtual environment (Controlled plant): This serves as the Digital Twin of the production hall, acting as the physical plant. It is where the decision sequence determined by the optimizer is executed. Crucially, this environment is responsible for measuring and returning the actual task completion times achieved by the robots, denoted as . This feedback enables the calculation of the prediction error and the subsequent update of the fault matrix F in accordance with the FTC mechanism.
- Python-based decision module (Central controller): This acts as the main processing unit, continuously gathering data from the KIS.ME hardware layer (the input vector ) and the Blender 3D environment (the measurements ). At each discrete step k, it evaluates the search tree to determine the optimal resource allocation.
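The feedback loop described above (measured completion time, prediction error, update of the fault estimate) can be sketched as follows. The update law shown here, a first-order correction scaled by a learning parameter, is an assumption made for illustration; the paper's actual estimator is not reproduced in this excerpt, and `update_fault_delay`, `alpha`, and the numbers are hypothetical.

```python
def update_fault_delay(f_est, predicted, measured, alpha=0.5):
    """Move the delay estimate toward the observed prediction error.

    Assumed update law: f <- max(0, f + alpha * (measured - predicted)).
    """
    error = measured - predicted
    return max(0.0, f_est + alpha * error)


nominal_time = 10.0  # hypothetical nominal transport time in seconds
fault_delay = 0.0    # initial estimate: no fault

# A robot running at half speed takes 20 s instead of 10 s.
history = []
for measured in [20.0, 20.0, 20.0]:
    predicted = nominal_time + fault_delay
    fault_delay = update_fault_delay(fault_delay, predicted, measured)
    history.append(fault_delay)
```

Under this assumed law the estimate approaches the true extra delay of 10 s geometrically, so later predictions account for the degraded robot.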
3.1. Introduction to the KIS.ME Platform
- Button 1—cycles the Operational LED to the next active colour in the sequence, allowing the operator to select the target loading station;
- Button 2—confirms the selection, creates a task entry in the queue, switches the LED to flashing mode as an acknowledgement signal, and locks the interface until the assigned robot completes the transport cycle.
- Triggers—initiate the evaluation of a rule;
- Conditions—logical premises (AND/OR);
- Actions—operations executed once the conditions are met.
3.2. Task Selection and Execution Logic
| Algorithm 1 User interaction handling in KIS.BOX |
|---|
| Set initial LED state = WHITE |
| while system is active do |
| if button 1 is pressed then |
| Switch LED to the next colour in the sequence |
| end if |
| if button 2 is pressed then |
| Read the current LED colour |
| Add a task for the selected colour to the queue |
| Set LED to flashing green |
| Lock button |
| end if |
| if task completed then |
| Unlock button |
| Set LED = WHITE |
| end if |
| end while |
- The task being added to the queue;
- The LED switching to flashing mode;
- The input interface being locked.
- The system receives a feedback signal;
- The device state is reset;
- A new cycle can be initiated.
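Algorithm 1 and the two event lists above can be read as a small state machine. The sketch below is one possible rendering; the class name, the colour order (taken from the starred entries of the colour table), and the choice to ignore button presses while locked or before a colour is selected are assumptions.

```python
from collections import deque

# Active colours in table order (assumed to be the cycling sequence).
ACTIVE_COLOURS = ["BLUE", "TURQUOISE", "MAGENTA", "RED", "YELLOW"]


class KisBox:
    """Minimal model of the KIS.BOX interaction logic."""

    def __init__(self):
        self.led = "WHITE"
        self.flashing = False
        self.locked = False
        self.queue = deque()  # pending tasks, identified by colour

    def press_button1(self):
        """Cycle to the next active colour (ignored while locked)."""
        if self.locked:
            return
        if self.led in ACTIVE_COLOURS:
            i = (ACTIVE_COLOURS.index(self.led) + 1) % len(ACTIVE_COLOURS)
        else:
            i = 0  # first press after WHITE selects the first colour
        self.led = ACTIVE_COLOURS[i]

    def press_button2(self):
        """Confirm the selection: enqueue, flash, lock."""
        if self.locked or self.led not in ACTIVE_COLOURS:
            return
        self.queue.append(self.led)
        self.led, self.flashing, self.locked = "GREEN", True, True

    def task_completed(self):
        """Feedback from the transport cycle resets the device."""
        self.led, self.flashing, self.locked = "WHITE", False, False


box = KisBox()
box.press_button1()
box.press_button1()   # now TURQUOISE
box.press_button2()   # task queued, interface locked
box.press_button1()   # ignored because the interface is locked
led_while_locked = box.led
box.task_completed()
```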
3.3. Communication Between Blender 3D and KIS.ME
3.4. Communication Architecture
3.5. Event-to-Action Mapping
Error Handling and Reconnection Strategy
3.6. Simulation Logic and Task Management
3.7. Simulation Model and Robot Behaviour
3.8. Task Management
3.9. System Architecture Overview
- Defines three robots (01.robot–03.robot), each with its own path represented as an ordered list of Vector waypoints in 3D space;
- Launches an HTTP server on port 8080 in a dedicated thread (threading.Thread);
- Launches a WebSocket client connecting to the KIS.ME API in a second dedicated thread.
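The threading arrangement described above can be sketched with the Python standard library alone. This is not the authors' script: the handler class, the `ROBOT_STATE` dictionary, and the JSON layout are hypothetical, and the WebSocket client thread is omitted.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical shared state; the real script keeps robot data in Blender.
ROBOT_STATE = {"01.robot": "docked", "02.robot": "docked", "03.robot": "docked"}


class StatusHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only the status endpoint is sketched here.
        if self.path.rstrip("/") == "/status":
            body = json.dumps({"robots": ROBOT_STATE}).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        pass  # keep the console quiet


def start_server(port=8080):
    """Start the HTTP server in a daemon thread and return the server."""
    server = ThreadingHTTPServer(("127.0.0.1", port), StatusHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server
```

Running the server in its own thread keeps the host application's main loop responsive; calling `start_server()` and then requesting `http://127.0.0.1:8080/status/` would return the JSON snapshot, and `server.shutdown()` stops it.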
4. Simulation and Discussion
- FIFO strategy: A classic greedy rule. Upon the arrival of a task, the algorithm assigns it to the first available robot. If multiple robots are ready, the fastest one is selected. This method does not analyse future system states.
- Rolling window Hungarian Algorithm (HA): A dynamic baseline representing non-predictive, real-time optimal task allocation [3]. Although the Hungarian Algorithm itself is classical, its use within a rolling-horizon framework is a common reference point for instantaneous dispatching in the AGV scheduling literature [3]. Because it solves the linear assignment problem optimally in polynomial time, it gives the best result available to a dispatcher that optimizes only the current step without a predictive horizon. At each decision step, a cost matrix is generated whose size depends on the number of pending tasks. For each robot–task combination, an independent 1-step forward prediction is computed to estimate the task completion time. This estimated time is evaluated using the quadratic function and added to the cumulative cost to serve as the corresponding weight in the matrix. A linear assignment optimization is then performed, but only the optimal assignment for the immediate current task k is executed. The window is then shifted forward, and the process repeats.
- MPC tree: The tree-based algorithm described in Section 2.3, operating with a prediction horizon of .
- FT tree: The fault-tolerant algorithm described in Section 2.4, utilizing a prediction horizon of and a learning parameter of .
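One step of the rolling-window assignment baseline described above can be sketched as follows. To stay dependency-free, the sketch replaces the Hungarian Algorithm with a brute-force search over permutations, which returns the same optimum for small fleets (in practice `scipy.optimize.linear_sum_assignment` would be the usual choice). The 1-step prediction model, the omission of the cumulative-cost term, and all names and numbers are simplifying assumptions.

```python
from itertools import permutations


def predicted_cost(robot_free, duration, arrival):
    """Quadratic cost of a 1-step forward prediction for one pairing."""
    finish = max(arrival, robot_free) + duration
    return (finish - arrival) ** 2


def rolling_window_step(robot_free, durations, pending_arrivals):
    """Return the robot to dispatch for the oldest pending task."""
    if not pending_arrivals:
        raise ValueError("no pending tasks")
    n_robots = len(robot_free)
    window = pending_arrivals[:n_robots]  # at most one task per robot
    best_perm, best_total = None, float("inf")
    # perm[i] is the robot assigned to the i-th task in the window.
    for perm in permutations(range(n_robots), len(window)):
        total = sum(
            predicted_cost(robot_free[r], durations[r], a)
            for r, a in zip(perm, window)
        )
        if total < best_total:
            best_perm, best_total = perm, total
    return best_perm[0]  # execute only the current task's assignment


# Hypothetical state: robot 0 is busy until t = 4, robot 1 is free.
chosen = rolling_window_step(
    robot_free=[4.0, 0.0], durations=[10.0, 12.0], pending_arrivals=[0.0, 1.0]
)
```

Only the first assignment is applied; the remaining pairings are discarded and recomputed at the next step, which is what distinguishes this baseline from a multi-step predictive search.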
- : high task frequency;
- : moderate task frequency;
- : low task frequency.
- Fault-free scenario: No disturbances occurred throughout the entire simulation.
- Multiple fault scenario:
- Fault in robot 1 after step : speed reduced to 50% of its nominal value.
- Obstacle on the path to station 3 after step : distance increased by 20%.
- Fault in robot 2 after step : speed reduced to 50% of its nominal value.
- Obstacle on the path to station 1 after step : distance increased by 10%.
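The effect of the listed faults on transport time follows from time = distance / speed. The short sketch below works through that arithmetic; the nominal distance and speed are hypothetical values, and the helper name is not from the paper.

```python
def transport_time(distance, nominal_speed, speed_factor=1.0, distance_factor=1.0):
    """Travel time after scaling speed and path length by the fault factors."""
    return (distance * distance_factor) / (nominal_speed * speed_factor)


# Hypothetical nominal leg: 30 m at 1.5 m/s.
nominal = transport_time(30.0, 1.5)
robot_fault = transport_time(30.0, 1.5, speed_factor=0.5)     # speed at 50%
obstacle_20 = transport_time(30.0, 1.5, distance_factor=1.2)  # path +20%
both = transport_time(30.0, 1.5, speed_factor=0.5, distance_factor=1.2)
```

Halving the speed doubles the travel time, whereas a 20% longer path adds 20%; when both faults affect the same leg the factors multiply, giving 2.4 times the nominal time.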
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
1. Naldi, L.D.; Galizia, F.G.; Bortolini, M.; Gabellini, M.; Ferrari, E. Unlocking the Potential of Mass Customization Through Industry 4.0: Mapping Research Streams and Future Directions. Appl. Sci. 2025, 15, 7160.
2. Lasi, H.; Fettke, P.; Kemper, H.G.; Feld, T.; Hoffmann, M. Industry 4.0. Bus. Inf. Syst. Eng. 2014, 6, 239–242.
3. De Ryck, M.; Versteyhe, M.; Debrouwere, F. Automated guided vehicle systems, state-of-the-art control algorithms and techniques. J. Manuf. Syst. 2020, 54, 152–173.
4. Atzori, L.; Iera, A.; Morabito, G. The Internet of Things: A survey. Comput. Netw. 2010, 54, 2787–2805.
5. Al-Fuqaha, A.; Guizani, M.; Mohammadi, M.; Aledhari, M.; Ayyash, M. Internet of Things: A Survey on Enabling Technologies, Protocols, and Applications. IEEE Commun. Surv. Tutor. 2015, 17, 2347–2376.
6. Witczak, M.; Lipiec, B.; Banaszak, Z. Fault-tolerant control-based flexible AGV transportation in a seat assembly system. IFAC-PapersOnLine 2019, 52, 67–72.
7. Majdzik, P. A feasible schedule for parallel assembly tasks in flexible manufacturing systems. Int. J. Appl. Math. Comput. Sci. 2022, 32, 51–63.
8. Patalas-Maliszewska, J.; Wiśniewski, R.; Zhou, M.; Topczak, M.; Wojnakowski, M. Applying additive manufacturing technologies to a supply chain: A petri net-based decision model. Int. J. Appl. Math. Comput. Sci. 2024, 34, 513–525.
9. Nie, W.; Luo, J.; Fu, Y.; Sun, S.; Li, D. Schedule of Flexible Manufacturing Systems Based on Petri Nets and A* Search with a Neural Network Heuristic Function. In Proceedings of the 2020 7th International Conference on Information Science and Control Engineering (ICISCE); IEEE: New York, NY, USA, 2020; pp. 1246–1250.
10. Pratissoli, F.; Brugioni, R.; Battilani, N.; Sabattini, L. Hierarchical Traffic Management of Multi-AGV Systems with Deadlock Prevention Applied to Industrial Environments. IEEE Trans. Autom. Sci. Eng. 2024, 21, 3155–3169.
11. Heidergott, B.; Olsder, G.J.; van der Woude, J.W. Max Plus at Work: Modeling and Analysis of Synchronized Systems: A Course on Max-Plus Algebra and Its Applications; Princeton University Press: Oxford, UK, 2006; Volume 13.
12. Al Bermanei, H.; Böling, J.M.; Högnäs, G. Modeling and scheduling of production systems by using max-plus algebra. Flex. Serv. Manuf. J. 2024, 36, 129–150.
13. van den Boom, T.J.; De Schutter, B. Modelling and control of discrete event systems using switching max-plus-linear systems. Control Eng. Pract. 2006, 14, 1199–1211.
14. van den Boom, T.J.; De Schutter, B. Model predictive control of manufacturing systems with max-plus algebra. In Formal Methods in Manufacturing; CRC Press: Boca Raton, FL, USA, 2018; pp. 343–378.
15. Lin, M.H.; Carlsson, J.G.; Ge, D.; Shi, J.; Tsai, J.F. A review of piecewise linearization methods. Math. Probl. Eng. 2013, 2013, 101376.
16. Witczak, M.; Majdzik, P.; Stetter, R.; Lipiec, B. A fault-tolerant control strategy for multiple automated guided vehicles. J. Manuf. Syst. 2020, 55, 56–68.
17. Witczak, M.; Seybold, L.; Bulach, E.; Maucher, N. Modern IoT Onboarding Platforms for Advanced Applications: A Practitioner’s Guide to KIS.ME; Studies in Systems, Decision and Control; Springer Nature: Berlin/Heidelberg, Germany, 2023; Volume 476.
18. Gubbi, J.; Buyya, R.; Marusic, S.; Palaniswami, M. Internet of Things (IoT): A vision. Future Gener. Comput. Syst. 2013, 29, 1645–1660.
19. Lee, J.; Bagheri, B.; Kao, H.A. A Cyber-Physical Systems Architecture for Industry 4.0-Based Manufacturing Systems. Manuf. Lett. 2015, 3, 18–23.
20. Tao, F.; Qi, Q.; Liu, A.; Nee, A. Digital Twins and Cyber–Physical Systems toward Smart Manufacturing and Industry 4.0: Correlation and Comparison. Engineering 2018, 5, 653–661.
21. Fuller, A.; Fan, Z.; Day, C.; Barlow, C. Digital Twin: Enabling Technologies, Challenges and Open Research. IEEE Access 2022, 10, 108952–108971.
22. Muchiri, P.; Pintelon, L. Performance measurement using overall equipment effectiveness (OEE): Literature review and practical application discussion. Int. J. Prod. Res. 2008, 46, 3517–3535.
23. Ullah, M.R.; Molla, S.; Siddique, I.M. Optimizing Performance: A Deep Dive into Overall Equipment Effectiveness (OEE) for Operational Excellence. J. Ind. Mech. 2023, 8, 26–40.
24. Ng Corrales, L.d.C.; Lambán, M.P.; Hernández Korner, M.E.; Royo, J. Overall Equipment Effectiveness: Systematic Literature Review and Overview of Different Approaches. Appl. Sci. 2020, 10, 6469.
25. Alsabbagh, W.; Sayegh, B.; Kim, C.; Amogbonjaye, S.; Patil, N.S.; Marceta, A.; Al-Kadri, O.; Langendorfer, P. MQTT Protocol in Industrial Internet of Things: Today Challenges and Tomorrow Solutions. Preprint 2025, 4.
26. Noor, M.b.M.; Hassan, W.H. Current research on Internet of Things (IoT) security: A survey. Comput. Netw. 2019, 148, 283–294.

| Methodology | Handles Alternative Assignments | Cost Function Flexibility | Fault-Tolerance Mechanism | Real-Time Physical IoT Dispatch |
|---|---|---|---|---|
| Standard Max-Plus [11,12,14] | No (Static/Cyclic) | N/A | No | No |
| Max-Plus + MILP [13,15] | Yes | Strictly Linear | No | No |
| Nominal SMPL Tree [13] | Yes | Nonlinear | No | No |
| Our Approach (FTC SMPL + IoT) | Yes | Nonlinear | Yes (Adaptive Delay Estimator) | Yes (Event-driven WebSocket/REST) |
| Parameter | Description |
|---|---|
| Communication | WLAN (Wi-Fi 2.4 GHz) |
| Connectors | M12 (8-pin, A-coded) |
| Power supply | 5 V (USB) or 24 V |
| Inputs/Outputs | 2 buttons + 2 GPIO ports (for 24 V) |
| Protection rating | IP65 |
| ID | Colour | HEX Code | Active |
|---|---|---|---|
| 0 | Blue | #0000FF | ★ |
| 1 | Turquoise | #00FFFF | ★ |
| 2 | Black (OFF) | – | |
| 3 | Green | #00FF00 | |
| 4 | Magenta | #FF00FF | ★ |
| 5 | Red | #FF0000 | ★ |
| 6 | White | #FFFFFF | |
| 7 | Yellow | #FFFF00 | ★ |
| Datapoint | Direction | Type | Description |
|---|---|---|---|
| ledColor | KIS.ME → Blender | Integer (0–7) | Current operational LED colour |
| ledMode | Blender → KIS.ME | Integer | LED mode: static or flashing |
| button1 | KIS.ME → Blender | Boolean | State of button 1 |
| button2 | KIS.ME → Blender | Boolean | State of button 2 |
| deviceStatus | KIS.ME → Blender | String | Device availability status |
| Event (DatapointId) | Value | Action in Blender |
|---|---|---|
| button1 | true | Advance LED colour to next in sequence |
| button2 | true | Create task, add to queue, lock interface |
| ledColor | 0–7 | Update digital twin LED visualisation |
| deviceStatus | offline | Flag robot as unavailable |
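The event-to-action mapping in the table above lends itself to a lookup table keyed by datapoint and value. The sketch below shows that pattern; the action labels are hypothetical names standing in for Blender-side callbacks, and the handling of unlisted events (returning `None`) is an assumption.

```python
# Keys are (datapointId, value) pairs taken from the mapping table;
# the action labels are hypothetical names for Blender-side callbacks.
EVENT_ACTIONS = {
    ("button1", True): "advance_led_colour",
    ("button2", True): "create_task_and_lock",
    ("deviceStatus", "offline"): "flag_robot_unavailable",
}


def map_event(datapoint_id, value):
    """Return the action label for an incoming event, or None if unmapped."""
    if datapoint_id == "ledColor":
        # Any colour ID from 0 to 7 updates the digital twin's LED.
        is_int = isinstance(value, int) and not isinstance(value, bool)
        if is_int and 0 <= value <= 7:
            return "update_twin_led"
        return None
    return EVENT_ACTIONS.get((datapoint_id, value))


pressed = map_event("button1", True)
released = map_event("button1", False)  # button release is not mapped
colour = map_event("ledColor", 4)
```

Keeping the mapping in data rather than in nested conditionals makes it straightforward to add a new datapoint without touching the dispatch function.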
| State | Description |
|---|---|
| docked | Robot is at its home position, awaiting a task |
| moving | Robot is travelling to the target location |
| loading | Loading operation in progress |
| unloading | Unloading operation in progress |
| returning | Robot is returning to its home position |
| fault | Fault state |
| Criterion | Description |
|---|---|
| Availability | The robot must be idle |
| Distance | Minimisation of travel distance |
| Workload | Number of currently assigned tasks |
| State | The robot must not be in the fault state |
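The four selection criteria in the table above can be combined into a simple filter-then-rank rule. The sketch below is one way to do so; treating distance as the primary key and workload as the tie-breaker is an assumption, as are the `Robot` fields and the example values.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Robot:
    name: str
    state: str       # one of the robot states, e.g. "docked" or "fault"
    distance: float  # travel distance to the target, in metres
    workload: int    # number of currently assigned tasks


def select_robot(robots: List[Robot]) -> Optional[Robot]:
    """Pick an idle, fault-free robot with the shortest travel distance."""
    # "docked" means idle at home, which also excludes the fault state.
    candidates = [r for r in robots if r.state == "docked"]
    if not candidates:
        return None
    # Assumed ranking: distance first, workload as tie-breaker.
    return min(candidates, key=lambda r: (r.distance, r.workload))


fleet = [
    Robot("01.robot", "fault", 5.0, 0),
    Robot("02.robot", "docked", 12.0, 1),
    Robot("03.robot", "docked", 12.0, 0),
]
chosen_robot = select_robot(fleet)
```

The nearest robot is skipped because it is in the fault state, and the tie on distance between the other two is resolved in favour of the one with the lower workload.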
| Aspect | Description |
|---|---|
| Concurrency | Tasks generated simultaneously by multiple operators |
| Queue | Shared data structure |
| Priorities | Possibility of task prioritisation |
| Datapoint | Description |
|---|---|
| button1Color | Colour identifier of button 1; a transition from white to any active station colour triggers insertion of a new task into the queue |
| button2Color | Colour identifier of button 2, encoding the operator-selected target station |
| Endpoint | Description |
|---|---|
| GET /status/ | Returns a JSON object containing the current state of all robots, KIS.BOX devices, and fault configuration parameters. Intended for external monitoring and decision-tree queries. |
| GET /{robot}/{destination} | Dispatches the specified robot to the indicated station and schedules its return trip. Accepts optional query parameters speed and station to override defaults at runtime. |
| Interface | Transmitted Data | Omitted Data |
|---|---|---|
| KIS.BOX → Blender | Button press colour change | Continuous LED status, device health |
| Blender → Client | Discrete robot state (5 values) | Full 3D trajectory, interpolated position |
| Client → Blender | Robot ID + destination | Path planning parameters, timing data |
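| Fault Scenario | Task frequency parameter | Algorithm | Mean Cost (±σ) | Won Iterations |
|---|---|---|---|---|
| 1. (no fault) | 40 | FIFO | | 2.6% |
| 1. (no fault) | 40 | HA | | 0.6% |
| 1. (no fault) | 40 | MPC/FT Tree | | 96.8% |
| 1. (no fault) | 50 | FIFO | | 0.8% |
| 1. (no fault) | 50 | HA | | 0.0% |
| 1. (no fault) | 50 | MPC/FT Tree | | 99.2% |
| 1. (no fault) | 70 | FIFO | | 1.0% |
| 1. (no fault) | 70 | HA | | 0.0% |
| 1. (no fault) | 70 | MPC/FT Tree | | 99.0% |
| 2. (with faults) | 40 | FIFO | | 28.2% |
| 2. (with faults) | 40 | HA | | 11.8% |
| 2. (with faults) | 40 | MPC Tree | | 14.6% |
| 2. (with faults) | 40 | FT Tree | | 45.4% |
| 2. (with faults) | 50 | FIFO | | 24.8% |
| 2. (with faults) | 50 | HA | | 14.8% |
| 2. (with faults) | 50 | MPC Tree | | 7.6% |
| 2. (with faults) | 50 | FT Tree | | 52.8% |
| 2. (with faults) | 70 | FIFO | | 5.4% |
| 2. (with faults) | 70 | HA | | 33.8% |
| 2. (with faults) | 70 | MPC Tree | | 0.0% |
| 2. (with faults) | 70 | FT Tree | | 60.8% |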
| Task frequency parameter | Algorithm (Row vs. Col) | FIFO | HA | MPC/FT Tree |
|---|---|---|---|---|
| 40 | FIFO | - | 60.8% | 2.6% |
| 40 | HA | 39.2% | - | 0.6% |
| 40 | MPC/FT Tree | 97.4% | 99.4% | - |
| 50 | FIFO | - | 90.6% | 0.8% |
| 50 | HA | 9.4% | - | 0.0% |
| 50 | MPC/FT Tree | 99.2% | 100.0% | - |
| 70 | FIFO | - | 100.0% | 1.0% |
| 70 | HA | 0.0% | - | 0.0% |
| 70 | MPC/FT Tree | 99.0% | 100.0% | - |
| Task frequency parameter | Algorithm (Row vs. Col) | FIFO | HA | MPC Tree | FT Tree |
|---|---|---|---|---|---|
| 40 | FIFO | - | 71.0% | 62.6% | 43.0% |
| 40 | HA | 29.0% | - | 39.2% | 25.2% |
| 40 | MPC Tree | 37.2% | 60.8% | - | 32.2% |
| 40 | FT Tree | 57.0% | 74.8% | 67.8% | - |
| 50 | FIFO | - | 63.2% | 69.6% | 35.8% |
| 50 | HA | 36.8% | - | 54.4% | 24.6% |
| 50 | MPC Tree | 30.0% | 45.6% | - | 19.6% |
| 50 | FT Tree | 64.2% | 75.4% | 80.4% | - |
| 70 | FIFO | - | 23.8% | 89.6% | 12.0% |
| 70 | HA | 76.2% | - | 97.4% | 36.4% |
| 70 | MPC Tree | 8.2% | 2.6% | - | 1.2% |
| 70 | FT Tree | 88.0% | 63.6% | 98.8% | - |
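The "won iterations" and pairwise percentages reported in the tables above can be computed from per-iteration costs along the following lines. This is a sketch of the bookkeeping only: the cost values are made up for illustration, and counting a tie as a win for neither side is an assumption about how the comparison was scored.

```python
def overall_win_rates(costs):
    """Percentage of iterations in which each algorithm is the sole best."""
    names = list(costs)
    n_iter = len(costs[names[0]])
    wins = {name: 0 for name in names}
    for i in range(n_iter):
        best = min(costs[name][i] for name in names)
        winners = [name for name in names if costs[name][i] == best]
        if len(winners) == 1:  # assumed: ties award no win
            wins[winners[0]] += 1
    return {name: 100.0 * w / n_iter for name, w in wins.items()}


def pairwise_win_rate(costs, row, col):
    """Percentage of iterations in which `row` has a strictly lower cost."""
    pairs = list(zip(costs[row], costs[col]))
    return 100.0 * sum(1 for a, b in pairs if a < b) / len(pairs)


# Made-up costs for four iterations, for illustration only.
example_costs = {
    "FIFO": [5.0, 3.0, 9.0, 4.0],
    "HA": [4.0, 6.0, 9.0, 5.0],
    "Tree": [3.0, 3.0, 7.0, 2.0],
}
overall = overall_win_rates(example_costs)
tree_vs_fifo = pairwise_win_rate(example_costs, "Tree", "FIFO")
fifo_vs_ha = pairwise_win_rate(example_costs, "FIFO", "HA")
```

With strict comparisons, the two entries of a pairwise cell pair need not sum to 100% whenever some iterations end in a tie.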
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Share and Cite
Zaborniak, D.; Kasza, P.; Pazera, M.; Witczak, M. Towards Fault-Tolerant AGV Task Scheduling in Flexible Manufacturing Systems Using a Tree-Based Max-Plus Predictive Approach. Sensors 2026, 26, 3898. https://doi.org/10.3390/s26123898

