A Stackelberg Trust-Based Human–Robot Collaboration Framework for Warehouse Picking
Abstract
1. Introduction
1.1. Motivation
1.2. Our Contributions
- Human trust modeling and evaluation: We formulate trust-aware HRC for warehouse picking as a Partially Observable Stochastic Game (POSG) and use a Bayesian posterior belief space to assess human trust in real time. We also model human fatigue with a logistic function, which enters the POSG through the collaboration efficiency reward function.
- Human–robot decision-making based on multi-round trust: We combine the Stackelberg strategy with the Bellman equation to design an iterative Stackelberg trust strategy generation algorithm. It is solved by a model-based exhaustive strategy search and yields the robot strategy that is optimal for its long-term benefit while accounting for human trust. Human strategies are generated from human trust using the sigmoid function.
- Decision analysis by probabilistic model checking: We formalize the generated human–robot decisions as a Partially Observable Markov Decision Process (POMDP). We specify properties of the framework, such as efficiency, accuracy, trust, and human fatigue, as PCTL formulae and verify them with the probabilistic model checker PRISM. For comparison, we also build a no-trust model and a single-round trust-based HRC model.
1.3. Structure of This Paper
2. Related Works
2.1. Trust in Automation
- (1) Offline Trust Models
- (2) Online Trust Models
2.2. Warehouse Picking
- (1) Manual Picking
- (2) Automated Picking
- (3) Human–Robot Collaborative Picking
3. Theoretical Foundations
3.1. Game Theory
3.2. Stackelberg Strategy
3.3. Probabilistic Model Checking
4. Stackelberg Trust-Based Human–Robot Collaboration Framework for Warehouse Picking
4.1. HRC Framework Overview
4.1.1. Task Classification
4.1.2. Human–Robot Work States
- Both the robot and the human choose to execute the task, so the human checks the robot’s completed work. If this state occurs frequently, it disrupts the workflow and reduces efficiency;
- The robot chooses to execute the task and the human does not, which indicates that the human trusts the robot’s execution results. If this state occurs frequently, it may reduce the accuracy of the collaboration;
- The robot withdraws from the task and leaves the human to take over. If this state occurs frequently, worker fatigue accumulates and efficiency falls.
4.1.3. Stackelberg Trust-Based HRC Framework
4.2. Trust Prediction Based on the POSG
4.2.1. The POSG Model
- a set of players (human and robot);
- a finite set of states;
- a finite set of actions for each player;
- a finite set of observations;
- an observation function;
- a transition function;
- a reward function, evaluated when the players’ actions are jointly played in the joint state;
- an initial belief over states.
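To make the roles of the transition function, the observation function, and the initial belief more concrete, the following is a minimal sketch of a discrete Bayesian belief update over a hidden trust state. It is an illustration only: the three trust levels, the matrices `T` and `Z`, the two observations, and the function name `update_belief` are assumptions introduced here, and the discretization replaces the Gaussian trust dynamics listed in Appendix A.

```python
def update_belief(belief, T, Z, obs):
    """One Bayes-filter step: predict with T, weight by Z[s][obs], then normalize."""
    n = len(belief)
    # Prediction: predicted[s2] = sum_s belief[s] * T[s][s2]
    predicted = [sum(belief[s] * T[s][s2] for s in range(n)) for s2 in range(n)]
    # Correction: weight each predicted state by the likelihood of the observation
    unnormalized = [predicted[s2] * Z[s2][obs] for s2 in range(n)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under the current belief")
    return [u / total for u in unnormalized]


# Hypothetical trust levels: 0 = low, 1 = medium, 2 = high.
T = [[0.8, 0.2, 0.0],
     [0.1, 0.8, 0.1],
     [0.0, 0.2, 0.8]]
# Hypothetical observations: 0 = human checks the robot's work, 1 = human does not check.
Z = [[0.9, 0.1],
     [0.5, 0.5],
     [0.1, 0.9]]

b0 = [1 / 3, 1 / 3, 1 / 3]                 # uniform initial belief
b1 = update_belief(b0, T, Z, obs=1)        # approximately [0.06, 0.40, 0.54]
```

Observing that the human does not check shifts the belief toward the high-trust level, which is the qualitative behavior a trust estimator of this kind is expected to show.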
4.2.2. Reward Function
Efficiency Reward Function Related to Fatigue
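The fatigue-related symbols listed in Appendix C (a maximum fatigue level, growth and decay coefficients, the most recent switch time, and the time at which fatigue reaches half of its maximum) suggest a logistic model along the following lines. This is a minimal sketch under assumptions: the exponential recovery during rest, the linear slowdown in `task_time`, and all function and parameter names are hypothetical and are not taken from the paper.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def fatigue_working(t, f_max, growth, t_half):
    """Logistic fatigue growth during work; equals f_max / 2 when t == t_half."""
    return f_max * sigmoid(growth * (t - t_half))


def fatigue_resting(t, f_at_switch, decay, t_switch):
    """Assumed exponential recovery from the fatigue level held at the switch to rest."""
    return f_at_switch * math.exp(-decay * (t - t_switch))


def task_time(base_time, fatigue, f_max, impact):
    """Assumed linear slowdown: a fully fatigued worker needs (1 + impact) times longer."""
    return base_time * (1.0 + impact * fatigue / f_max)


# Example: fatigue after 20 time units of work, then after 5 units of rest.
f_work = fatigue_working(20, f_max=1.0, growth=0.3, t_half=10)
f_rest = fatigue_resting(25, f_at_switch=f_work, decay=0.2, t_switch=20)
slow_pick = task_time(10.0, f_work, f_max=1.0, impact=0.4)
```

An efficiency reward could then penalize `task_time`, so that rounds in which the human takes over while fatigued score lower than the same rounds for a rested worker.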
Accuracy Reward Function
4.3. Human–Robot Decision-Making Based on Multi-Round Trust
- (1) Human Strategy
- (2) Multi-Round Robot Strategy
Algorithm 1. Iterative Stackelberg Trust Strategy Generation
Input:
Output:
- Initialization;
- Determine the effective turn probability accuracy;
- For each round, evaluate the human strategies;
- Find the policy set that meets the system requirements, including the corresponding execution probability of the human;
- Take the maximum value over the policy set.
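The steps of Algorithm 1 can be illustrated with a small backward-induction sketch that combines a leader-follower round with a Bellman backup. It is not the authors' implementation: the five discrete trust levels, the payoffs, the robot accuracy, the minimum-accuracy threshold, the trust transitions, and every name below are hypothetical, and only a reward discount factor is used (Appendix D also lists a discount factor for probability, which is omitted here).

```python
import math

LEVELS = 5        # hypothetical discrete trust levels 0..4
ROBOT_ACC = 0.9   # hypothetical probability that the robot picks correctly
MIN_ACC = 0.93    # hypothetical minimum accuracy required by the system
GAMMA = 0.95      # hypothetical discount factor for reward


def p_skip(level):
    """Assumed sigmoid map from trust level to the probability that the human skips checking."""
    trust = level / (LEVELS - 1)
    return 1.0 / (1.0 + math.exp(-8.0 * (trust - 0.5)))


def clamp(level):
    return max(0, min(LEVELS - 1, level))


def outcomes(level, action):
    """Branches of one round as (probability, reward, accuracy, next trust level)."""
    if action == "withdraw":
        return [(1.0, 3.0, 1.0, clamp(level - 1))]
    skip = p_skip(level)
    return [
        (skip * ROBOT_ACC, 10.0, 1.0, clamp(level + 1)),         # unchecked success
        (skip * (1 - ROBOT_ACC), -20.0, 0.0, clamp(level - 1)),  # unchecked failure
        (1 - skip, 6.0, 1.0, level),                             # human checks the result
    ]


def plan(horizon):
    """Exhaustive search over robot actions with a Bellman backup per round."""
    value = [0.0] * LEVELS
    policy = []
    for _ in range(horizon):
        new_value, stage = [], []
        for level in range(LEVELS):
            candidates = {}
            for action in ("execute", "withdraw"):
                branches = outcomes(level, action)
                accuracy = sum(p * acc for p, _, acc, _ in branches)
                if accuracy < MIN_ACC:
                    continue  # action violates the accuracy requirement
                candidates[action] = sum(
                    p * (r + GAMMA * value[nxt]) for p, r, _, nxt in branches
                )
            best = max(candidates, key=candidates.get)
            stage.append(best)
            new_value.append(candidates[best])
        value = new_value
        policy.insert(0, stage)  # policy[0] is the first round
    return value, policy


value, policy = plan(horizon=5)
```

In this toy model the accuracy filter removes `execute` at the two highest trust levels, because a human who rarely checks lets too many robot errors through; the robot then withdraws, which in turn lowers trust for later rounds.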
4.4. Decision Analysis with Probabilistic Model Checking
- a finite set of states;
- a finite set of actions;
- a finite set of observations;
- an observation function;
- a transition function;
- a reward function;
- an initial belief over states.
- Probabilistic operator: the probability that a path satisfies the given path formula meets the stated bound.
- Reward operator: the expected value of the given reward formula, which takes one of three forms, under the chosen reward structure meets the stated bound.
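For orientation, the two operators are commonly written as follows in the PRISM documentation. The generic forms are standard notation; the two instances underneath are hypothetical examples, and the labels `done` and `fatigue` as well as the numeric bounds are assumptions rather than properties taken from the paper.

```latex
% Generic forms: probabilistic operator and reward operator
\mathtt{P}_{\bowtie p}\,[\,\psi\,]
\qquad
\mathtt{R}^{r}_{\bowtie q}\,[\,\rho\,]

% Hypothetical instance: the picking task is eventually completed with probability at least 0.9
\mathtt{P}_{\geq 0.9}\,[\,\mathrm{F}\ \mathit{done}\,]

% Hypothetical instance: the expected accumulated fatigue reward until completion is at most 5
\mathtt{R}^{\mathit{fatigue}}_{\leq 5}\,[\,\mathrm{F}\ \mathit{done}\,]
```

Here $\bowtie$ stands for a comparison such as $\geq$ or $\leq$, $\psi$ is a path formula, $\rho$ is a reward formula, and $r$ names a reward structure.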
5. Experiment
5.1. Human Trust and Evaluation
5.2. Human Strategy and Robot Strategy
- No-Trust Model: In this model, the robot ignores human trust during the collaborative process. Human strategies are still influenced by the human’s own trust level, but the robot does not take this into account. The robot always executes regular tasks and always opts out of complex tasks, and it does not adjust its behavior as the collaboration progresses.
- Single-Round Trust Strategy Model: In this model, the robot evaluates human trust using the belief space and predicts human strategies from the evaluated trust. For each task, the robot applies the single-round Stackelberg strategy to achieve the optimal benefit within that round.
- Multi-Round Trust Strategy Model: We apply the multi-round trust strategy to the human–robot collaboration. This model considers both the impact of trust on the current task and the influence of the robot’s decisions on future states. It achieves better long-term benefits and shapes future human trust through the robot’s decision-making process.
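As a rough illustration of the single-round model, the sketch below lets the robot act as the leader: it predicts the human's response from the current trust estimate and picks the action with the best expected payoff for that round only. The payoffs, the robot accuracies per task type, the sigmoid parameters, and all names are hypothetical.

```python
import math

# Hypothetical probability that the robot completes each task type correctly.
ROBOT_ACCURACY = {"regular": 0.95, "complex": 0.60}


def human_skip_probability(trust, steepness=8.0, midpoint=0.5):
    """Assumed sigmoid map from trust in [0, 1] to the probability of not checking."""
    return 1.0 / (1.0 + math.exp(-steepness * (trust - midpoint)))


def single_round_stackelberg(task_type, trust):
    """Return the robot's best action for this round and the expected payoffs."""
    acc = ROBOT_ACCURACY[task_type]
    p_skip = human_skip_probability(trust)
    # Unchecked execution: +10 on success, -20 on an undetected failure.
    unchecked = acc * 10.0 + (1.0 - acc) * (-20.0)
    # Checked execution: errors are caught, but the round is slower.
    checked = 6.0
    expected = {
        "execute": p_skip * unchecked + (1.0 - p_skip) * checked,
        "withdraw": 3.0,  # the human re-executes the task
    }
    return max(expected, key=expected.get), expected


best_regular, _ = single_round_stackelberg("regular", trust=0.9)
best_complex, _ = single_round_stackelberg("complex", trust=0.9)
```

With these assumed numbers, a highly trusting human makes unchecked execution of a complex task too risky, so the robot withdraws, while a regular task is still executed. A multi-round strategy differs in that it would also weigh how each choice changes trust in later rounds.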
5.3. Formal Modeling and Verification
5.3.1. POMDP Model
5.3.2. Model Verification
- (1) Trust
- (2) Efficiency
- (3) Fatigue
- (4) Accuracy
6. Conclusions and Future Work
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A. Definition of Symbols in the Stackelberg Trust-Based HRC Framework
| Symbols | Definition |
|---|---|
| | A set of players (human and robot). |
| | The task type at each time step. |
| | The work states of each player at each time step. |
| | The task outcome (success/failure) at each time step. |
| | The human trust state at each time step. |
| | The actions of each player at each time step. |
| | The observation at each time step. |
| | The belief state at each time step. |
| | A Gaussian distribution with a given mean and standard deviation. |
| | The observation path at each time step. |
| | The trust dynamics, which obey a Gaussian distribution at each time step. |
| | The assessment of trust through the belief space. |
Appendix B. Definitions of States
States | Definition |
---|---|
Environment | The task type in the environment is a normal task. |
Environment | The task type in the environment is a complex task. |
Robot | The robot is executing tasks. |
Robot | The robot does not execute tasks and is waiting for the human to execute tasks. |
Human | The human is checking the robot’s execution results. |
Human | The human does not check the robot’s execution results. |
Human | The human is re-executing the task. |
Appendix C. Definition of Symbols in the Reward Function
| Symbols | Definition |
|---|---|
| | The reward functions on collaborative efficiency. |
| | The reward functions on collaborative efficiency. |
| | The regular time reward function. |
| | The time reward function associated with human fatigue. |
| | The fatigue level at each time step. |
| | The fatigue levels under work states. |
| | The fatigue levels under rest states. |
| | The maximum fatigue level. |
| | The sigmoid function. |
| | The coefficients of the fatigue growth rate. |
| | The coefficients of the fatigue decay rate. |
| | The most recent time point at which a switch between work and rest states occurred. |
| | The time point at which the fatigue level reaches half of the maximum. |
| | The impact of fatigue on execution efficiency. |
Appendix D. Definition of Symbols in Human–Robot Decision-Making
| Symbols | Definition |
|---|---|
| | The execution probability of an action of each player at each time step. |
| | The strategies of each player. |
| | Predictions for human strategies. |
| | The optimal strategies for the robot. |
| | The probability function of the robot strategy set that meets the system’s minimum requirements. |
| | The task execution accuracy at each time step. |
| | The system’s requirement for the minimum accuracy of picking tasks. |
| | The discount factor for probability. |
| | The discount factor for reward. |
Task Types | |||
---|---|---|---|
10 | 20 | 18 | |
6 | 20 | 16 |
1 | 1 | - | 1 |
Comparison | | | | | | Mean |
---|---|---|---|---|---|---|
Multi-round vs. No-trust | −1.96% | 1.93% | 21.85% | 50.48% | 72.71% | 20.24% |
Multi-round vs. Single-round | 0.64% | 1.17% | 1.26% | 1.67% | 0.5% | 1.05% |
Comparison | | | | | | Mean |
---|---|---|---|---|---|---|
Multi-round vs. No-trust | 78.64% | −40.29% | −73.28% | −68.39% | −52.12% | −59.40% |
Multi-round vs. Single-round | −48.30% | −47.62% | −39.46% | −28.06% | −8.38% | −29.10% |
Comparison | | | | | | Mean |
---|---|---|---|---|---|---|
Multi-round vs. No-trust | 0.23% | −1.52% | −3.04% | −4.36% | −5.18% | −2.77% |
Multi-round vs. Single-round | −0.09% | −0.12% | −0.10% | −0.11% | 0.09% | −0.06% |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Liu, Y.; Guo, F.; Ma, Y. A Stackelberg Trust-Based Human–Robot Collaboration Framework for Warehouse Picking. Systems 2025, 13, 348. https://doi.org/10.3390/systems13050348