Automatic Fault Detection and Diagnosis in ROS-Based Robotic Systems Using Generative AI: A Systematic Literature Review
Abstract
1. Introduction
2. Methodology
2.1. Review Procedure
- Formulation of the overall research objective and definition of the Research Questions (RQs).
- Development of search queries and selection of relevant data sources.
- Screening of retrieved records based on predefined inclusion and exclusion criteria.
- Full-text assessment of eligible studies and structured data extraction.
- Synthesis, aggregation, and thematic analysis of the extracted data to derive higher-level findings.
2.2. Research Questions
- RQ1: What limitations do current FDD approaches in ROS present with respect to automation and human effort?
- RQ2: How do existing monitoring and observability frameworks in ROS support FDD?
- RQ3: How can LLMs enhance automated fault detection, diagnosis, and explanation in ROS-based systems?
- RQ4: What gaps exist in integrating observability with AI-driven diagnosis?
2.3. Databases and Search Queries
- IC1: Studies addressing monitoring, diagnosis, debugging, or observability in ROS-based systems.
- IC2: Studies proposing frameworks, tools, or methodologies for ROS monitoring.
- IC3: Studies focusing on fault detection, anomaly detection, or diagnosis in ROS.
- IC4: Studies integrating LLMs, generative AI, or AI agents with ROS.
- EC1: Studies not directly related to ROS 1 or ROS 2.
- EC2: Studies lacking sufficient technical or methodological detail.
- EC3: Studies whose primary contribution was a domain-specific solution (e.g., FDD tailored exclusively to a particular robot type or task context) without presenting a generalizable framework, method, or finding applicable to ROS-based systems broadly. Studies describing domain-specific deployments that also contributed broadly applicable monitoring frameworks, tool designs, or AI integration patterns were retained.
- Query 1—FDD
| (("Robot Operating System" OR ROS OR ROS 2) AND ("Fault detection and diagnosis" OR "FDD" OR "fault detection" OR "fault diagnosis")) |
- This query yielded 126 results from SpringerLink and 90 results from IEEE Xplore, totaling 216 initial records.
- Query 2—Monitoring and Debugging
| ("Robot Operating System" OR ROS OR ROS 2) AND ("runtime monitoring" OR "real-time monitoring" OR "online monitoring" OR debugging OR "runtime verification") |
- This query yielded 329 results from SpringerLink and 215 results from IEEE Xplore, totaling 544 initial records.
- Query 3—LLM and AI Integration
| (("Robot Operating System" OR "ROS" OR "ROS 2") AND ("large language model" OR "large language models" OR LLM OR "AI agent" OR "AI agents" OR "Agentic AI" OR "AI powered agent")) |
- This query yielded 115 results from SpringerLink and 100 results from IEEE Xplore, totaling 215 initial records.
- Snowballing: Backward and forward citation tracking was performed on key papers identified during the initial screening phase to discover additional relevant studies. This resulted in the identification of 11 additional papers.
- Official Documentation: Key papers from the official ROS documentation and the https://ros.org/ website were reviewed to ensure that foundational and widely adopted tools and frameworks were included. This yielded two additional papers.
- Consensus: The Consensus AI-assisted search platform (https://consensus.app) was used in an exploratory capacity during the early stages of the review. It was queried using topic-level phrases such as “fault detection ROS”, “runtime monitoring Robot Operating System”, “LLM robotic systems”, and “observability ROS diagnostics”. The purpose was to acquire broader familiarity with the research landscape, identify emerging terminology, and discover work at the intersection of ROS, FDD, observability, and generative AI. Consensus was not employed as a systematic, reproducible search strategy: its underlying index is not publicly auditable, result ranking is opaque, and the platform does not support Boolean query syntax equivalent to the structured queries used in IEEE Xplore and SpringerLink. Through this process, two studies were identified that had not been captured by the database searches or snowballing: González-Santamarta et al. [30] and Sobrín-Hidalgo et al. [17]. Both were subsequently validated against the predefined ICs and ECs before admission to the final corpus. Their Consensus-sourced origin is acknowledged as a non-reproducible element in Section 6.2.
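The record counts reported above can be tallied per query and per database. The following is a minimal sketch of that arithmetic; the dictionary and function names are illustrative and do not come from the review. The sums are taken before any de-duplication, since the overlap between queries is not stated here.

```python
# Per-database record counts as reported for the three Boolean queries.
QUERY_COUNTS = {
    "Query 1 (FDD)": {"SpringerLink": 126, "IEEE Xplore": 90},
    "Query 2 (Monitoring and Debugging)": {"SpringerLink": 329, "IEEE Xplore": 215},
    "Query 3 (LLM and AI Integration)": {"SpringerLink": 115, "IEEE Xplore": 100},
}

# Studies added through the supplementary strategies.
SUPPLEMENTARY = {"Snowballing": 11, "Official documentation": 2, "Consensus": 2}


def per_query_totals(counts):
    """Sum the records of each query across databases."""
    return {query: sum(by_db.values()) for query, by_db in counts.items()}


def per_database_totals(counts):
    """Sum the records of each database across queries."""
    totals = {}
    for by_db in counts.values():
        for database, n in by_db.items():
            totals[database] = totals.get(database, 0) + n
    return totals


query_totals = per_query_totals(QUERY_COUNTS)
database_totals = per_database_totals(QUERY_COUNTS)
initial_records = sum(query_totals.values())      # before de-duplication
supplementary_total = sum(SUPPLEMENTARY.values())

if __name__ == "__main__":
    print(query_totals)
    print(database_totals)
    print(initial_records, supplementary_total)
```

With these inputs the per-query totals are 216, 544, and 215, which matches the figures reported for each query, and the three queries together returned 975 initial records (570 from SpringerLink and 405 from IEEE Xplore).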
2.4. Data Extraction
- Monitoring and observability frameworks and tools.
- Diagnostic techniques and automation levels.
- AI-based integration approaches.
- Limitations and future research directions.
2.5. Study Quality Assessment
3. Results
3.1. Taxonomy
3.2. Bibliometric Overview
3.2.1. Temporal Distribution
3.2.2. Publication Type and Venue Distribution
3.2.3. Topic Evolution Across Research Phases
3.2.4. Thematic Clusters and Cross-Cutting Patterns
- Static Analysis and Code Quality (3 studies: [4,11,12]). These studies focus on pre-deployment fault prevention through code mining, architectural analysis, and bug characterization. They share an emphasis on software engineering practices and produce artifacts (e.g., bug taxonomies, code patterns) that could serve as knowledge bases for downstream diagnostic systems.
- Runtime Verification and Formal Monitoring (5 studies: [6,7,8,9,10]). This cluster comprises formal and configuration-driven monitoring frameworks that verify runtime properties. A cross-cutting pattern is the trade-off between specification rigor and usability—formal approaches offer precision but require expertise, while configuration-based tools improve accessibility at the cost of coverage.
- Observability Infrastructure (4 studies: [20,21,31,32]). Studies in this cluster develop low-level tracing, network monitoring, and anomaly detection capabilities. They provide the telemetry foundation upon which higher-level diagnostic reasoning could operate, yet remain disconnected from AI-based interpretation.
- LLM–ROS Integration and Task Execution (7 studies: [13,14,18,19,33,34,35]). This cluster encompasses agentic frameworks that expose ROS primitives to LLMs via tool use, MCP, or Reasoning and Acting (ReAct) patterns. A notable gap is that these systems primarily target task execution and natural language interaction rather than systematic fault detection.
- Explainability and Human Understanding (5 studies: [15,16,17,30,36]). These studies focus on generating human-readable explanations of robot behavior, supporting HRI and operator trust. They demonstrate LLMs’ potential for fault explanation but operate post hoc on logs rather than in real-time diagnostic loops.
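The cluster assignment above can be checked mechanically against the reference numbers listed in Appendix A. The sketch below is illustrative only: the variable and function names are assumptions, and the data are transcribed from the cluster list and the appendix table.

```python
# Reference numbers per thematic cluster, transcribed from the list above.
CLUSTERS = {
    "Static Analysis and Code Quality": [4, 11, 12],
    "Runtime Verification and Formal Monitoring": [6, 7, 8, 9, 10],
    "Observability Infrastructure": [20, 21, 31, 32],
    "LLM-ROS Integration and Task Execution": [13, 14, 18, 19, 33, 34, 35],
    "Explainability and Human Understanding": [15, 16, 17, 30, 36],
}

# Reference numbers of all included studies, transcribed from Appendix A.
APPENDIX_A_REFS = {
    2, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21,
    26, 27, 28, 30, 31, 32, 33, 34, 35, 36,
}


def cluster_sizes(clusters):
    """Number of studies assigned to each cluster."""
    return {name: len(refs) for name, refs in clusters.items()}


def overlapping_refs(clusters):
    """References that appear in more than one cluster."""
    seen, duplicates = set(), set()
    for refs in clusters.values():
        for ref in refs:
            (duplicates if ref in seen else seen).add(ref)
    return duplicates


def unclustered_refs(all_refs, clusters):
    """Included studies that are not assigned to any cluster."""
    clustered = {ref for refs in clusters.values() for ref in refs}
    return sorted(all_refs - clustered)


sizes = cluster_sizes(CLUSTERS)
duplicates = overlapping_refs(CLUSTERS)
remaining = unclustered_refs(APPENDIX_A_REFS, CLUSTERS)

if __name__ == "__main__":
    print(sizes, duplicates, remaining)
```

Under these inputs the five clusters cover 24 of the 29 studies in Appendix A with no overlap. The remaining references [2,5,26,27,28] are the node composition benchmark and the survey papers.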
4. Fundamental Concepts, Tools and Background
4.1. Robot Operating System
4.2. Native ROS Debugging Tools and Commands
4.3. FDD
4.4. Fault Taxonomy
4.5. Observability
4.6. LLM and Agentic AI
5. Related Work
5.1. FDD in Robotic Systems
5.2. ROS-Based Monitoring Frameworks
5.3. LLM and Agentic-AI-Based Tools
6. Discussion
6.1. Research Directions
6.2. Threats to Validity
- Construct validity: The three Boolean query groups may not capture all relevant terminology in the rapidly evolving fields of agentic AI and LLM-based robotics, where consistent terminology has not yet stabilised. Three complementary strategies—snowballing, official documentation review, and Consensus-assisted exploration—were employed to mitigate this risk. Query execution dates, exact filter configurations, and per-database result counts are documented in Appendix A to support replication.
- Internal validity: Title screening and full-text assessment were performed using the predefined ICs and ECs. Ambiguous cases were discussed with the co-authors until consensus was reached, but inter-rater reliability was not formally quantified. This introduces a risk of subjective inclusion decisions, particularly for studies at the boundary of EC3 (domain-specific applications).
- External validity: IEEE Xplore and SpringerLink were selected as primary sources because they index the dominant venues for ROS-related research, including IEEE Robotics and Automation Letters, IROS, ICRA, the RoSE workshop series, Springer TAROS, and LNCS volumes. Studies published in venues outside IEEE Xplore and SpringerLink that were not reachable through snowballing may have been missed. The 11 papers added via snowballing and the 2 identified through Consensus suggest that the supplementary strategies partially compensated for this gap.
- Reliability: All search queries, applied filters, and screening criteria are documented within this paper. The full list of included studies is provided in Appendix A, enabling third-party verification of inclusion decisions. Consensus introduces a non-reproducible exploratory element; however, the two papers it surfaced [17,30] were subsequently validated against the predefined ICs and ECs before admission to the corpus, and are clearly identified as Consensus-sourced in the methodology.
- Timeliness: The database searches were executed in December 2025, representing a fixed snapshot of the literature at that point. The LLM–robotics integration field evolves rapidly, and studies published after December 2025 are outside this review’s scope. The rapid pace of development also means that some tools referenced may have evolved since inclusion.
- Scope: This study focuses on software-level architectural and observational aspects of FDD in ROS-based systems. Hardware–software interaction fault modes and embodied mechanical intelligence approaches are explicitly outside the scope of this review and are identified as complementary directions for future work. Furthermore, inference latency under ROS 2 timing constraints is not systematically reported in the reviewed literature, revealing an open empirical gap that warrants dedicated benchmarking in future studies.
7. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A. List and Categorization of Included Studies
| Ref. | Title | Authors | Year | Venue | Type | Country | DOI | Citations | QS | QR |
|---|---|---|---|---|---|---|---|---|---|---|
| [11] | Mining the Usage Patterns of ROS Primitives | Santos, A. et al. | 2017 | IROS | C | Portugal | https://doi.org/10.1109/IROS.2017.8206237 | 33 | 7 | H |
| [5] | On Fault Detection and Diagnosis in Robotic Systems | Khalastchi, E., Kalech, M. | 2018 | ACM Comput. Surv. | J | Israel | https://doi.org/10.1145/3146389 | 215 | 9 | H |
| [6] | ROSMonitoring: A Runtime Verification Framework for ROS | Ferrando, A. et al. | 2020 | TAROS (LNCS) | C | UK | https://doi.org/10.1007/978-3-030-63486-5_40 | 87 | 8 | H |
| [31] | ROS-FM: Fast Monitoring for the Robotic Operating System | Rivera, S. et al. | 2020 | ICECCS | C | Luxembourg | https://doi.org/10.1109/ICECCS51672.2020.00029 | 25 | 6 | M |
| [36] | Online Monitoring and Visualization with ROS and ReactJS | Ivanov, A. et al. | 2021 | SIBCON | C | Russia | https://doi.org/10.1109/SIBCON50419.2021.9438890 | 10 | 5 | M |
| [12] | The High-Assurance ROS Framework | Santos, A. et al. | 2021 | RoSE @ ICSE | W | Portugal | https://doi.org/10.1109/RoSE52553.2021.00013 | 25 | 6 | M |
| [8] | Towards Flexible Runtime Monitoring Support for ROS-based Applications | Stadler, M. et al. | 2022 | RoSE @ ICSE | W | Austria | https://doi.org/10.1145/3526071.3527515 | 6 | 6 | M |
| [21] | ros2_tracing: Multipurpose Low-Overhead Framework for Real-Time Tracing of ROS 2 | Bédard, C. et al. | 2022 | IEEE RA-L | J | Canada | https://doi.org/10.1109/LRA.2022.3174346 | 77 | 10 | H |
| [20] | An Empirical Study on Fault Diagnosis in Robotic Systems | Song, X. et al. | 2023 | ICSME | C | China | https://doi.org/10.1109/ICSME58846.2023.00030 | 3 | 7 | H |
| [9] | ROMoSu: Flexible Runtime Monitoring Support for ROS-based Applications | Stadler, M., Vierhauser, M. | 2023 | RoSE @ ICSE | W | Austria | https://doi.org/10.1109/RoSE59155.2023.00013 | 2 | 6 | M |
| [26] | Large Language Models for Robotics: A Survey | Zeng, F. et al. | 2023 | arXiv | P | China | https://doi.org/10.48550/arXiv.2311.07226 | 291 | 6 | M |
| [30] | Using Large Language Models for Interpreting Autonomous Robots Behaviors | González-Santamarta, M. Á. et al. | 2023 | HAIS (LNCS) | C | Spain | https://doi.org/10.1007/978-3-031-40725-3_45 | 19 | 7 | H |
| [2] | Impact of ROS 2 Node Composition in Robotic Systems | Macenski, S. et al. | 2023 | IEEE RA-L | J | USA | https://doi.org/10.1109/LRA.2023.3279614 | 109 | 10 | H |
| [4] | ROBUST: 221 Bugs in the Robot Operating System | Timperley, C. S. et al. | 2024 | Empir. Softw. Eng. | J | USA | https://doi.org/10.1007/s10664-024-10440-0 | 11 | 8 | H |
| [7] | ROSMonitoring 2.0: Extending ROS Runtime Verification to Services and Ordered Topics | Saadat, M. G. et al. | 2024 | FMAS (EPTCS) | W | UK | https://doi.org/10.4204/EPTCS.411.3 | 4 | 5 | M |
| [10] | Runtime Verification and Field-based Testing for ROS-based Robotic Systems | Caldas, R. et al. | 2024 | IEEE TSE | J | Sweden | https://doi.org/10.1109/TSE.2024.3444697 | 35 | 10 | H |
| [28] | Advances in Large Language Models for Robotics | Qi, Z., Jing, X. | 2024 | ICMRA | C | China | https://doi.org/10.1109/ICMRA62519.2024.10809099 | 4 | 6 | M |
| [15] | Explaining Robot Failures in ROS using Parameter-Efficient Fine-Tuning | Scheltinga, E., Pek, C. | 2024 | RSS Workshop | W | The Netherlands | N/A | N/A | 5 | M |
| [16] | Personalising Explanations for Robot Failures in ROS using PEFT | Scheltinga, E. M. | 2024 | TU Delft | T. | The Netherlands | N/A | N/A | 3 | L |
| [17] | Explaining Autonomy: Enhancing HRI through Explanation Generation with LLMs | Sobrín-Hidalgo, D. et al. | 2024 | arXiv | P | Spain | https://doi.org/10.48550/arXiv.2402.04206 | 28 | 6 | M |
| [34] | OperateLLM: Integrating ROS Tools in Large Language Models | Raja, A., Bhethanabotla, A. | 2024 | ICoCET | C | USA | https://doi.org/10.1109/ICoCET63343.2024.10730448 | 4 | 6 | M |
| [19] | Bagel | Extelligence-ai | 2024 | GitHub | Tool | – | N/A | N/A | 3 | L |
| [35] | ROS-LLM: A ROS Framework for Embodied AI with Task Feedback | Mower, C. E. et al. | 2024 | arXiv | P | UK | https://doi.org/10.48550/arXiv.2406.19741 | 54 | 6 | M |
| [27] | Large Language Models for Robotics: Opportunities, Challenges, and Perspectives | Wang, J. et al. | 2025 | J. Autom. Intell. | J | China | https://doi.org/10.1016/j.jai.2024.12.003 | 381 | 9 | H |
| [13] | Enabling Novel Mission Operations and Interactions with ROSA | Royce, R. et al. | 2025 | IEEE Aerosp. Conf. | C | USA | https://doi.org/10.1109/AERO63441.2025.11068426 | 23 | 9 | H |
| [14] | ROS Help Desk: GenAI Powered Framework for ROS Error Diagnosis | Katuwandeniya, K. et al. | 2025 | arXiv | P | Australia | https://doi.org/10.48550/arXiv.2507.07846 | N/A | 3 | L |
| [33] | RAI: Flexible Agent Framework for Embodied AI | Rachwał, K. et al. | 2025 | PAAMS | C | Poland | https://doi.org/10.1007/978-3-032-05925-3_16 | 6 | 7 | H |
| [18] | ROSBag MCP Server: Analyzing Robot Data with LLMs | Fu, L. et al. | 2025 | RoboticCC | C | Italy | https://doi.org/10.1109/RoboticCC68732.2025.00025 | 1 | 6 | M |
| [32] | Watch Your Callback: Offline Anomaly Detection Using ML in ROS 2 | Kang, J. et al. | 2025 | IEEE Access | J | S. Korea | https://doi.org/10.1109/ACCESS.2025.3556864 | 6 | 7 | H |
| Ref. | Summary | Keywords | ROS | Categories & Sub-Tags |
|---|---|---|---|---|
| [11] | Data mining of ROS code repositories to identify fault-prone patterns. | ROS, static analysis, code patterns, fault prevention, software quality | 1 | Static Analysis/QA: Code Quality, Bug Mining |
| [5] | Comprehensive survey of FDD approaches across robotic system types. | fault detection, fault diagnosis, robotic systems, survey, model-based, data-driven | Ag. | FDD: Data-Driven, Model-Based, Knowledge-Based, Hybrid |
| [6] | Modular RV framework for monitoring inter-node communication via formal specifications. | ROS, runtime verification, formal specification, monitoring, safety-critical | 1 | Monitoring: Runtime Verification |
| [31] | Low-overhead network-level monitoring using eBPF and XDP for ROS traffic. | ROS, monitoring, eBPF, XDP, network security | 1 | Monitoring: Network-Level; Observability: Metrics |
| [36] | Browser-based remote visualization solution for ROS system monitoring. | ROS, online monitoring, visualization, ReactJS | 1 | Monitoring: Configuration-Based |
| [12] | Static analysis framework for code quality and architectural pattern detection. | ROS, static analysis, HAROS, code quality, architectural analysis | Both | Static Analysis/QA: Code Quality, Architectural Analysis |
| [8] | Configuration-driven runtime monitoring for ROS-based applications. | ROS, runtime monitoring, configuration, flexibility | Both | Monitoring: Runtime Verification, Configuration-Based |
| [21] | Low-overhead tracing framework for ROS 2 execution events and timing analysis. | ROS 2, tracing, LTTng, instrumentation, real-time, performance evaluation | 2 | Observability: Tracing |
| [20] | Empirical study showing traces and trajectories improve fault diagnosis accuracy. | ROS, fault diagnosis, tracing, empirical study | 1 | FDD: Hybrid; Observability: Logging, Tracing |
| [9] | Flexible configuration-driven monitoring replacing rigid formal specifications. | ROS, monitoring, configuration, runtime | Both | Monitoring: Runtime Verification, Configuration-Based |
| [26] | Survey of LLM applications in robot control, perception, and planning. | LLM, robotics, survey, planning, control | Ag. | LLM/Generative AI: Task Execution |
| [30] | Evaluates LLMs interpreting raw ROS 2 logs without prompt engineering. | LLM, ROS 2, log interpretation, autonomous robots, explainability | 2 | LLM/Gen. AI: Log Interpretation; HRI/Expl.: Failure Expl. |
| [2] | Benchmarks ROS 2 component node composition for performance optimization. | ROS 2, node composition, performance, benchmarking | 2 | ROS Architecture: ROS 2, Node Composition |
| [4] | Dataset characterizing 221 bugs from seven major ROS projects. | robotics, software bugs, dataset, Robot Operating System | Both | Bugs/Fault Taxonomy: Bug Dataset, Fault Classification |
| [7] | Extends ROSMonitoring with service monitoring and message ordering. | ROS, runtime verification, services, message ordering | Both | Monitoring: Runtime Verification |
| [10] | Guidelines for ROS observability, instrumentation, and field-based testing. | ROS 2, runtime verification, field testing, instrumentation, observability | 2 | Monitoring: Runtime Verif.; Observability: Logging, Tracing |
| [28] | Survey of recent LLM advances for robotic applications. | LLM, robotics, survey, advances | Ag. | LLM/Generative AI: Task Execution |
| [15] | PEFT transforms raw ROS logs into human-readable failure narratives. | ROS, LLM, PEFT, LoRA, failure explanation | 2 | LLM/Gen. AI: Fault Expl.; RAG/MCP: Fine-Tuning; HRI/Expl.: Failure Expl. |
| [16] | Extends failure explanations with personalization based on user expertise. | ROS, LLM, PEFT, personalization, explainability | 2 | LLM/Gen. AI: Fault Expl.; RAG/MCP: Fine-Tuning; HRI/Expl.: Personalization |
| [17] | Uses RAG with LLMs for context-aware robot behavior explanations. | robotics, HRI, explainability, LLM, RAG, autonomous robots | 2 | LLM/Gen. AI: Log Interp.; RAG/MCP: RAG; HRI/Expl.: Failure Expl. |
| [34] | Enables LLMs to dynamically generate and execute ROS nodes via rclpy. | ROS, LLM, ReAct, code generation, rclpy | 2 | Agentic AI: ReAct, Tool Use; LLM/Gen. AI: Task Execution |
| [19] | MCP-based tool for natural-language log analysis of ROS bag data. | ROS, MCP, bag files, LLM, diagnostics | 2 | LLM/Gen. AI: Log Interp.; RAG/MCP: MCP; Observability: Logging |
| [35] | Exposes ROS Actions and Services as LLM tools with behavior trees. | ROS, LLM, embodied AI, behavior trees, state machines, task feedback | 2 | Agentic AI: Tool Use, Behavior Trees; LLM/Gen. AI: Task Execution |
| [27] | Survey of LLM opportunities and challenges in robotic systems. | LLM, robotics, survey, challenges, opportunities | Ag. | LLM/Generative AI: Task Execution |
| [13] | NASA’s natural language interface for ROS via ReAct and LangChain. | ROS, ROSA, LLM, ReAct, LangChain, NASA | Both | Agentic AI: ReAct, Tool Use; HRI/Expl.: NL Interface |
| [14] | Proactive error detection via continuous log/sensor monitoring with LLM diagnosis. | ROS, LLM, RAG, error diagnosis, debugging, explainability, GenAI | 2 | FDD: Knowledge-Based; LLM/Gen. AI: Fault Expl., Code Review; RAG/MCP: RAG; Agentic AI: ReAct |
| [33] | Multi-agent framework treating sensors and actuators as agent capabilities. | ROS 2, multi-agent, embodied AI, RAG | 2 | Agentic AI: Multi-Agent; LLM/Gen. AI: Task Execution; RAG/MCP: RAG |
| [18] | MCP server enabling natural-language interaction with ROS bag files. | ROS, MCP, LLM, VLM, rosbag, agentic AI | 2 | RAG/MCP: MCP; LLM/Gen. AI: Log Interp.; Agentic AI: Tool Use |
| [32] | ML-based offline anomaly detection from ROS 2 callback trace data. | ROS 2, anomaly detection, unsupervised learning, callbacks, fault injection, tracing | 2 | FDD: Data-Driven; Monitoring: Anomaly Detection; Observability: Tracing |
References
- Macenski, S.; Foote, T.; Gerkey, B.; Lalancette, C.; Woodall, W. Robot operating system 2: Design, architecture, and uses in the wild. Sci. Robot. 2022, 7, eabm6074.
- Macenski, S.; Soragna, A.; Carroll, M.; Ge, Z. Impact of ROS 2 node composition in robotic systems. IEEE Robot. Autom. Lett. 2023, 8, 3996–4003.
- Quigley, M.; Conley, K.; Gerkey, B.; Faust, J.; Foote, T.; Leibs, J.; Ng, A.; Wheeler, R. ROS: An open-source Robot Operating System. In Proceedings of the IEEE International Conference on Robotics and Automation Workshop on Open Source Software, Kobe, Japan, 12–17 May 2009.
- Timperley, C.S.; van der Hoorn, G.; Santos, A.; Deshpande, H.; Wąsowski, A. ROBUST: 221 bugs in the Robot Operating System. Empir. Softw. Eng. 2024, 29, 57.
- Khalastchi, E.; Kalech, M. On fault detection and diagnosis in robotic systems. ACM Comput. Surv. 2018, 51, 1–24.
- Ferrando, A.; Cardoso, R.C.; Fisher, M.; Ancona, D.; Franceschini, L.; Mascardi, V. ROSMonitoring: A runtime verification framework for ROS. In Proceedings of the Annual Conference Towards Autonomous Robotic Systems; Springer International Publishing: Cham, Switzerland, 2020; pp. 387–399.
- Saadat, M.G.; Ferrando, A.; Dennis, L.A.; Fisher, M. ROSMonitoring 2.0: Extending ROS runtime verification to services and ordered topics. Electron. Proc. Theor. Comput. Sci. (EPTCS) 2024, 411, 17–31.
- Stadler, M.; Vierhauser, M.; Cleland-Huang, J. Towards flexible runtime monitoring support for ROS-based applications. In Proceedings of the 4th International Workshop on Robotics Software Engineering (RoSE), Pittsburgh, PA, USA, 9 May 2022; pp. 43–46.
- Stadler, M.; Vierhauser, M. ROMoSu: Flexible runtime monitoring support for ROS-based applications. In Proceedings of the IEEE/ACM 5th International Workshop on Robotics Software Engineering (RoSE), Melbourne, Australia, 15 May 2023; pp. 53–60.
- Caldas, R.; García, J.A.P.; Schiopu, M.; Pelliccione, P.; Rodrigues, G.; Berger, T. Runtime verification and field-based testing for ROS-based robotic systems. IEEE Trans. Softw. Eng. 2024, 50, 2544–2567.
- Santos, A.; Cunha, A.; Macedo, N.; Arrais, R.; Dos Santos, F.N. Mining the usage patterns of ROS primitives. In Proceedings of the 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Vancouver, BC, Canada, 24–28 September 2017; pp. 3855–3860.
- Santos, A.; Cunha, A.; Macedo, N. The high-assurance ROS framework. In Proceedings of the 2021 IEEE/ACM 3rd International Workshop on Robotics Software Engineering (RoSE), Madrid, Spain, 2 June 2021; pp. 37–40.
- Royce, R.; Kaufmann, M.; Becktor, J.; Moon, S.; Carpenter, K.; Pak, K.; Towler, A.; Thakker, R.; Khattak, S. Enabling novel mission operations and interactions with ROSA: The robot operating system agent. In Proceedings of the 2025 IEEE Aerospace Conference, Big Sky, MT, USA, 1–8 March 2025; pp. 1–16.
- Katuwandeniya, K.; Widhanapathirana, S.R.J. ROS Help Desk: GenAI powered, user-centric framework for ROS error diagnosis and debugging. arXiv 2025, arXiv:2507.07846.
- Scheltinga, E.; Pek, C. Explaining robot failures in ROS using parameter-efficient fine-tuning. In Proceedings of the Workshop on Robot Execution Failures and Failure Management Strategies, RSS 2024, Delft, The Netherlands, 15–19 July 2024; Available online: https://robot-failures.github.io/rss2024/papers/RobotFailuresRSS2024_paper_2.pdf (accessed on 18 December 2025).
- Scheltinga, E.M. Personalising Explanations for Robot Failures in Robot Operating System Using Parameter-Efficient Fine-Tuning. Master’s Thesis, Department of Mechanical Engineering, TU Delft, Delft, The Netherlands, 2024.
- Sobrín-Hidalgo, D.; González-Santamarta, M.A.; Guerrero-Higueras, Á.M.; Rodríguez-Lera, F.J.; Matellán-Olivera, V. Explaining autonomy: Enhancing human-robot interaction through explanation generation with large language models. arXiv 2024, arXiv:2402.04206.
- Fu, L.; Salimpour, S.; Militano, L.; Edelman, H.; Queralta, J.P.; Toffetti, G. ROSBag MCP Server: Analyzing robot data with LLMs for agentic embodied AI applications. In Proceedings of the 2025 International Conference on Robotic Computing and Communication (RoboticCC), Naples, Italy, 8–10 December 2025; pp. 70–77.
- Extelligence-ai/Bagel. GitHub, 2024. Available online: https://github.com/Extelligence-ai/bagel (accessed on 18 December 2025).
- Song, X.; Li, Y.; Dong, Z.; Liu, S.; Cao, J.; Peng, X. An empirical study on fault diagnosis in robotic systems. In Proceedings of the 2023 IEEE International Conference on Software Maintenance and Evolution (ICSME), Bogotá, Colombia, 1–6 October 2023; pp. 207–219.
- Bédard, C.; Lütkebohle, I.; Dagenais, M. ros2_tracing: Multipurpose low-overhead framework for real-time tracing of ROS 2. IEEE Robot. Autom. Lett. 2022, 7, 6511–6518.
- Datadog. Modern Monitoring & Security. Datadog, Inc. Available online: https://www.datadoghq.com/ (accessed on 18 December 2025).
- New Relic. Observability Platform. New Relic, Inc. Available online: https://newrelic.com (accessed on 18 December 2025).
- OpenTelemetry. OpenTelemetry: Effective Observability Requires High-Quality Telemetry. Cloud Native Computing Foundation. Available online: https://opentelemetry.io/ (accessed on 18 December 2025).
- What Is the Model Context Protocol (MCP)? Model Context Protocol Documentation. Available online: https://modelcontextprotocol.io/docs/getting-started/intro (accessed on 18 December 2025).
- Zeng, F.; Gan, W.; Wang, Y.; Liu, N.; Yu, P.S. Large language models for robotics: A survey. arXiv 2023, arXiv:2311.07226.
- Wang, J.; Shi, E.; Hu, H.; Ma, C.; Liu, Y.; Wang, X.; Yao, Y.; Liu, X.; Ge, B.; Zhang, S. Large language models for robotics: Opportunities, challenges, and perspectives. J. Autom. Intell. 2025, 4, 52–64.
- Qi, Z.; Jing, X. Advances in large language models for robotics. In Proceedings of the 2024 7th International Conference on Mechatronics, Robotics and Automation (ICMRA), Wuhan, China, 20–22 September 2024; pp. 72–76.
- PRISMA. PRISMA 2020 Statement. 2021. Available online: https://www.prisma-statement.org/prisma-2020 (accessed on 18 December 2025).
- González-Santamarta, M.Á.; Fernández-Becerra, L.; Sobrín-Hidalgo, D.; Guerrero-Higueras, Á.M.; González, I.; Lera, F.J.R. Using large language models for interpreting autonomous robots behaviors. In Proceedings of the International Conference on Hybrid Artificial Intelligence Systems; Springer Nature: Cham, Switzerland, 2023; pp. 533–544.
- Rivera, S.; Iannillo, A.K.; Lagraa, S.; Joly, C.; State, R. ROS-FM: Fast monitoring for the robotic operating system (ROS). In Proceedings of the 2020 25th International Conference on Engineering of Complex Computer Systems (ICECCS), Singapore, 28–31 October 2020; pp. 187–196.
- Kang, J.; Kim, K.; Kwon, D. Watch your callback: Offline anomaly detection using machine learning in ROS 2. IEEE Access 2025, 13, 60763–60775.
- Rachwał, K.; Majek, M.; Boczek, B.; Dąbrowski, K.; Liberadzki, P.; Dąbrowski, A.; Ganzha, M. RAI: Flexible agent framework for embodied AI. In Proceedings of the International Conference on Practical Applications of Agents and Multi-Agent Systems; Springer Nature: Cham, Switzerland, 2025; pp. 195–206.
- Raja, A.; Bhethanabotla, A. OperateLLM: Integrating robot operating system (ROS) tools in large language models. In Proceedings of the 2024 IEEE 1st International Conference on Communication Engineering and Emerging Technologies (ICoCET), Kepala Batas, Malaysia, 2–3 September 2024; pp. 1–4.
- Mower, C.E.; Wan, Y.; Yu, H.; Grosnit, A.; Gonzalez-Billandon, J.; Zimmer, M.; Wang, J.; Zhang, X.; Zhao, Y.; Zhai, A.; et al. ROS-LLM: A ROS framework for embodied AI with task feedback and structured reasoning. arXiv 2024, arXiv:2406.19741.
- Ivanov, A.; Zakiev, A.; Tsoy, T.; Hsia, K.H. Online monitoring and visualization with ROS and ReactJS. In Proceedings of the 2021 International Siberian Conference on Control and Communications (SIBCON), Kazan, Russia, 13–15 May 2021; pp. 1–4.
- ISO/IEC/IEEE 24765:2010; Systems and Software Engineering—Vocabulary. International Organization for Standardization; International Electrotechnical Commission; Institute of Electrical and Electronics Engineers; International Standard: Geneva, Switzerland, 2010. Available online: https://www.iso.org/standard/50518.html (accessed on 18 December 2025).
- Richardson, C. Microservices Patterns: With Examples in Java; Simon & Schuster: New York, NY, USA, 2018.
- Zipkin. Distributed Tracing System. Available online: https://zipkin.io/ (accessed on 18 December 2025).
- Bo, V.; Garrell, A.; Sanfeliu, A. Fast or accurate? How intention-recognition models shape human perception of a mobile robot. In Proceedings of the Companion Proceedings of the 21st ACM/IEEE International Conference on Human-Robot Interaction, Scotland, UK, 16–19 March 2026; pp. 502–506.



| Criterion | Category | Score |
|---|---|---|
| Peer-review status | Yes/No | 2/0 |
| Venue tier | High/Medium/Low | 3/2/1 |
| Age-normalised citations | High/Medium/Low | 3/1/0 |
| Publication recency | Recent (≥2022)/Not recent | 2/1 |
| Age Bracket | Years | High (+3) | Medium (+1) | Low (+0) |
|---|---|---|---|---|
| Very recent (<1 yr) | 2025 | ≥10 | 3–9 | <3 |
| Recent (1–2 yr) | 2023–2024 | ≥25 | 8–24 | <8 |
| Established (3–4 yr) | 2021–2022 | ≥50 | 15–49 | <15 |
| Mature (≥5 yr) | ≤2020 | ≥80 | 20–79 | <20 |
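The two scoring tables above can be read as a simple additive rubric. The sketch below shows one way to compute a study's total under that reading. The function names are illustrative, and two points are assumptions rather than statements from the review: a missing citation count ("N/A") is treated as Low, and the venue tier is supplied by the caller because the tier assignment per venue is not listed here.

```python
from typing import Optional

PEER_REVIEW_POINTS = {True: 2, False: 0}
VENUE_TIER_POINTS = {"high": 3, "medium": 2, "low": 1}

# (first_year, last_year, high_threshold, medium_threshold) per age bracket.
# first_year=None means "no lower bound" (the "Mature" bracket, <=2020).
CITATION_BRACKETS = [
    (2025, 2025, 10, 3),    # Very recent
    (2023, 2024, 25, 8),    # Recent
    (2021, 2022, 50, 15),   # Established
    (None, 2020, 80, 20),   # Mature
]


def citation_points(year: int, citations: Optional[int]) -> int:
    """Age-normalised citation points: High=3, Medium=1, Low=0."""
    if citations is None:
        return 0  # assumption: an unavailable count is scored as Low
    for first, last, high, medium in CITATION_BRACKETS:
        if (first is None or year >= first) and year <= last:
            if citations >= high:
                return 3
            if citations >= medium:
                return 1
            return 0
    raise ValueError(f"no age bracket defined for year {year}")


def recency_points(year: int) -> int:
    """Recent (>=2022) scores 2, otherwise 1."""
    return 2 if year >= 2022 else 1


def quality_score(peer_reviewed: bool, venue_tier: str,
                  year: int, citations: Optional[int]) -> int:
    """Sum of the four criteria; the maximum is 2 + 3 + 3 + 2 = 10."""
    return (PEER_REVIEW_POINTS[peer_reviewed]
            + VENUE_TIER_POINTS[venue_tier]
            + citation_points(year, citations)
            + recency_points(year))


if __name__ == "__main__":
    # Venue tiers below are assumed for illustration.
    print(quality_score(True, "high", 2022, 77))    # journal article, 2022
    print(quality_score(True, "high", 2018, 215))   # survey, 2018
    print(quality_score(False, "low", 2025, None))  # preprint, no count
```

With the assumed venue tiers, the three calls give 10, 9, and 3, which agrees with the QS values listed in Appendix A for [21], [5], and [14].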
| Category | Key Elements and Definitions |
|---|---|
| Fault Origin | Hardware, Software, Interaction |
| Fault Type | Recoverable vs. Non-Recoverable |
| FDD Approach | Data-Driven, Model-Based, Knowledge-Based, Hybrid |
| Fault Diagnosis | Proactive: system-initiated; Reactive: user-initiated or triggered; Preventive: anticipates faults before occurrence; Corrective: responds after fault |
| Fault Detection | Online: real-time monitoring; Offline: post hoc analysis |
| Observability | Logs, Traces, Metrics, Sensor Readings |
| Monitoring | Alerts: automated notifications; Dashboards: visual inspection interfaces |
| Analysis Type | Dynamic: analysis during real-time execution; Static: analysis on recorded/historical data or source code |
| Verification | RV, Simulation-based, Field-based |
| AI Integration | LLMs, Agents |
| Automation | Manual: continuous human intervention required for both detection and diagnosis; Semi-automated: automated detection with human-guided diagnosis, or automated monitoring that requires significant expert configuration; Fully automated: autonomous detection, diagnosis, and explanation without human input per fault event |

| Publication Type | Count | Publisher | Count |
|---|---|---|---|
| Conference Paper | 11 | IEEE | 14 |
| Journal Article | 7 | arXiv | 4 |
| Workshop Paper | 5 | Springer | 4 |
| arXiv Preprint | 4 | ACM | 2 |
| Thesis | 1 | Elsevier | 1 |
| Software/Tool | 1 | Other | 4 |

| Thematic Category | Foundation (2017–2018) | Maturation (2020–2022) | AI Integration (2023–2025) |
|---|---|---|---|
| FDD/Bug Taxonomy | 2 | 1 | 1 |
| Static Analysis/QA | 1 | 1 | 0 |
| Runtime Monitoring/RV | 0 | 4 | 1 |
| Observability (Tracing/Metrics) | 0 | 2 | 1 |
| ROS Architecture | 0 | 1 | 1 |
| LLM/Agentic AI | 0 | 0 | 13 |
| LLM Survey | 0 | 0 | 3 |
| HRI/Explainability | 0 | 0 | 2 |

| Framework | Ref. | Primary Purpose | Key Technologies | Detection Mode | Diagnosis Mode | Observability | Monitoring | Analysis Type | Fault Strategy |
|---|---|---|---|---|---|---|---|---|---|
| ROS-LLM | [35] | Natural language task execution | DeepSeek 7B; CoT; Few-shot; Tools | None | None | None | None | Dynamic | None |
| OperateLLM | [34] | Development-time ROS interaction | LLMs; ReAct; ROS Tools | None | None | None | None | Dynamic | None |
| RAI | [33] | Multi-agent embodied AI framework | LLMs; RAG; Multi-Agents; ROS Tools; LangChain | Online | None | Logs, Sensor Readings | None | Dynamic | Both |
| ROSA | [13] | Natural language interface for ROS operations | LLMs; ReAct; ROS Tools; LangChain | Online | Reactive | Logs, Sensors, Topics | Dashboards | Dynamic | Both |
| ROS Help Desk | [14] | Error detection and debugging support | LLMs; RAG; ReAct; LangChain; ROS Tools; Gradio | Online | Proactive | Logs, Sensors, Topics, Source Code | Dashboards | Dynamic | Both |
| ROSBag MCP | [18] | Rosbag data analysis via natural language | LLMs/VLMs; MCP; ROS Tools | Offline | Reactive | Logs, Sensors, Topics | Dashboards | Static | Corrective |
| Bagel | [19] | Analyze data to provide informed answers | LLMs; MCP | Online | Reactive | Logs, Sensors, Topics, Metadata | Dashboards | Static | Both |
| Scheltinga et al. | [15,16] | Failure explanation generation for navigation | LLMs; RAG; PEFT; Low-Rank Adaptation (LoRA) | None | Reactive | Logs | None | Static | Corrective |
| González-S. et al. | [30] | Robot behaviour based on log interpretation | Generic LLM | None | Reactive | Logs | None | Static | None |
| Sobrín-H. et al. | [17] | Robot behaviour based on log interpretation | LLMs; RAG | None | Reactive | Logs | None | Static | None |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Cardoso, M.; Arrais, R.; Sousa, A. Automatic Fault Detection and Diagnosis in ROS-Based Robotic Systems Using Generative AI: A Systematic Literature Review. Appl. Sci. 2026, 16, 5545. https://doi.org/10.3390/app16115545

