Article

LLM-Enhanced Control of a Mobile Robotic Platform for Smart Industry

by Mihai-Daniel Pavel 1, Grigore Stamatescu 1,2,*, Marek Chodnicki 3 and Catalin Gheorghe Amza 4
1 Asti Automation, 139 Calea Plevnei, 060011 Bucharest, Romania
2 Department of Automation and Industrial Informatics, National University of Science and Technology Politehnica of Bucharest, 313 Splaiul Independentei, 060042 Bucharest, Romania
3 Faculty of Mechanical Engineering and Ship Technology, Gdansk University of Technology, 11/12 Gabriela Narutowicza Street, 80233 Gdansk, Poland
4 Faculty of Industrial Engineering and Robotics, National University of Science and Technology Politehnica of Bucharest, 313 Splaiul Independentei, 060042 Bucharest, Romania
* Author to whom correspondence should be addressed.
Appl. Sci. 2026, 16(4), 1680; https://doi.org/10.3390/app16041680
Submission received: 23 December 2025 / Revised: 3 February 2026 / Accepted: 4 February 2026 / Published: 7 February 2026

Abstract

The emergence of highly complex generative AI and large language models represents both a significant challenge and an opportunity for multiple engineering domains. Under the Industry 4.0 paradigm, various connected automation and industrial engineering applications can leverage the inference and generative design capabilities of these models to improve control algorithms and systems. In particular, the widespread deployment of mobile robotic platforms in modern industry, enhanced with LLM capabilities, can provide a substantial increase in the efficiency and cost-effectiveness of such solutions. In this study, we investigate the suitability of current-generation LLM systems for industrial mobile robot control. We present a systematic, end-to-end methodology for benchmarking four GenAI/LLMs, SmolLM2, Llama 3.2, Gemma3, and Gemma3-qat, for a typical mobile robot platform configuration. The approach is two-stage: we first assess the models' domain-specific knowledge in an industrial context, and then integrate them with a ROS2-based robotic simulation environment. Reported results focus on the quantitative assessment of multiple metrics (quality, coverage, speed, and reliability) and their integration into aggregated scoring mechanisms, which can help developers select and adapt the best model for a particular application, together with a custom software implementation.

1. Introduction

As mentioned in Wang et al. [1], the rising expectations in robotics are closely tied to the rapid evolution of large language models (LLMs). As LLMs continue to evolve and accumulate grounded knowledge from multimodal data, they are expected to play a pivotal role in the next generation of intelligent robotic systems. By addressing the existing challenges through collaborative and interdisciplinary efforts, the full potential of LLMs in robotics can be unlocked, leading to advancements that could profoundly impact various industries and aspects of daily life Zeng et al. [2]. We build upon this vision by integrating LLMs Liu et al. [3] into an industrial mobile robot logistics application, where the models act as ROS-based expert agents that help end-users monitor and control the mobile platform through natural language.
Following the work of Okumuş et al. [4], which emphasizes that simulation-based experiments are well suited to demonstrate the theoretical applicability of novel control approaches, we evaluated our LLM-enhanced control solution both on the physical ASTIbot platform and in a ROS2-based simulation environment.
In their recent survey, Bernardo et al. [5] argue that AI strategies are increasingly being incorporated into robots, with a growing emphasis on semantic-knowledge-based approaches that allow robots to understand not only spatial aspects of the environment but also the meaning of each element and how humans interact with them.
Large industrial organizations, such as SIEMENS AG and Beckhoff Automation, are actively deploying many types of software agents across various areas of industrial automation Rainer Jr. et al. [6], from software development assistants to self-diagnostics and self-maintenance software agents. Human-centric automation paradigms Dong et al. [7] also lead to the integration of such software agents in complex frameworks of LLM-based solutions, including applications in prognostics and maintenance Vidyaratne et al. [8].
We consider two types of robotic systems for the purpose of our study: hardware and software robots. Hardware or physical robots are tangible systems composed of a multitude of sensors and actuators designed to manipulate or execute a predefined task in a variety of work environments while collaborating with human operators Sherwani et al. [9]. Software or virtual robots are entities that operate entirely in digital environments and simulations and consist of algorithms, including ML algorithms, that allow them to imitate human actions Agostinelli et al. [10].

1.1. Hardware Robots

Physical mobile robots, such as autonomous mobile robots (AMRs) and automated guided vehicles (AGVs), are now a standard element in many smart factories and warehouses Pavel et al. [11]. They move materials between storage, production lines, and shipping areas, and they can also perform routine inspection or inventory tasks. Their main advantage is that they execute repetitive transport and handling operations with high precision and consistent timing, which supports stable and predictable production flows.
Compared to manual handling, mobile robots can operate continuously, with no need for breaks or shift changes. This enables higher throughput and smoother coordination with production schedules, especially in multi-shift or 24/7 environments Raj and Kos [12]. They can also handle heavy loads and work in areas that may be uncomfortable or unsafe for human workers, such as narrow aisles, cold storage, or zones with moving machinery. This can reduce workplace accidents and ergonomic strain.
The introduction of mobile robots requires a significant initial investment in hardware, infrastructure adaptation, and integration with existing IT and production systems. However, once deployed, they can reduce the dependency on manual transport tasks and lower labor costs over time Liu et al. [13]. Because they offer repeatable performance and fewer handling errors, they can also reduce damage to materials and improve overall process quality. As fleets grow, companies can scale up robot usage more easily than recruiting and training additional operators for the same tasks.

1.2. Software Robots

Alongside physical robots, many industrial processes now rely on software robots. These include rule-based process automation scripts, orchestration services, monitoring agents, and, more recently, large language model-based assistants. Unlike physical robots, virtual robots act on digital information, reading data from databases or logs, communicating with APIs, and issuing commands to other systems, including physical robots Shidaganti et al. [14].
In the context of mobile robots, software agents are responsible for planning and coordinating tasks, interfacing with higher-level systems, and providing feedback to human operators. Traditional approaches rely on fixed rules or handcrafted decision logic: each workflow or exception must be encoded manually. This makes the system rigid and hard to adapt when production requirements change or when unexpected events occur on the shop floor.
LLM-based agents extend these software robots by adding the ability to interpret natural language, combine heterogeneous information, and generate structured plans. They can translate high-level human instructions into concrete missions for mobile robots, propose re-planning options when paths are blocked or orders are delayed, and summarize system status in a way that is easier for operators to understand. Because they are purely software, they are easy to update, copy, and scale, and they can be integrated with existing industrial IT systems without changes to the physical environment.

1.3. LLM-Enhanced Control for Mobile Platforms

This research focuses on the interaction between physical mobile robots and LLM-based software agents in smart industry. The physical robots provide the ability to act in the factory; they move materials, inspect areas, and interact with other equipment. The goal is not to replace existing low-level control and navigation algorithms, but to enhance higher-level decision-making and interactions around them.
In the proposed approach, operators and production planners can express tasks in natural language using terms from their daily work (for example, specifying which line needs material, or which area needs inspection). The LLM agent interprets these instructions, checks contextual information such as current robot locations and production priorities, and then generates structured missions that existing planning and control modules can execute. When disturbances occur, such as blocked routes or urgent orders, the LLM agent can propose alternative plans or rescheduling options and present them in an understandable way.
By combining physical mobile platforms with LLM-based software robots, the system aims to increase flexibility and reduce the engineering effort required to adapt to new products, layouts, or workflows. Instead of modifying low-level code for each change, many adjustments can be made through high-level interaction with the LLM agent. At the same time, physical robots still provide the reliable, repeatable execution needed for industrial environments. This research evaluates how this combined architecture affects efficiency, robustness, and usability in representative smart industry scenarios. Table 1 presents a key overview of hardware and software robots, along with the potential of AI components and methods for the improvement of their operation.
This paper is organized as follows: Section 2 describes the experimental setup, including the evaluated language models, the ROS/ROS2 benchmark suite, and the scoring methodology, and it details the mobile robot control and kinematic modeling background used throughout the study. Section 3 presents quantitative results on quality, coverage, speed, and reliability, together with aggregated rankings and representative prompt examples. Finally, the Conclusion summarizes the key findings and outlines implications for selecting and deploying LLM-based agents in ROS2-driven mobile robotics applications.

2. Materials and Methods

We evaluated six large language models (LLMs), presented in Table 2, as expert agents for Robot Operating System (ROS) and ROS2 development. Models were tested on 40+ prompts covering beginner- to expert-level topics including navigation, perception, hardware integration, and system optimization. The Gemma3 models achieved the highest composite scores (62.7/100 for Gemma3-qat and 62.1/100 for Gemma3), demonstrating superior quality (84.6–85.6/100) and topic coverage (47.6–48.6/100). Our analysis reveals significant performance variations across different ROS domains, with implications for selecting appropriate models for robotics applications.
While general LLM benchmarks exist Zografos and Moussiades [17], domain-specific evaluation for robotics remains underexplored.
Table 2. Large language models evaluated as ROS/ROS2 expert agents.
Model | Description | Size
SmolLM2 Allal et al. [18] | Small-scale efficient model | 256.35 MB
Llama 3.2 Zhao et al. [19] | Meta’s latest open model | 1.87 GB
Gemma3, Gemma3-qat Team et al. [20] | Google’s efficient model | 2.31 GB
DeepSeek R1 Distill | Reasoning-focused distilled model | 4.6 GB
Qwen 3 | Alibaba’s multilingual model | 4.7 GB
The smallest available GPT-Oss model exceeds 11 GB, which prevents it from fitting into the dedicated GPU memory on our hardware (maximum 4 GB). The same limitation applies to other state-of-the-art models such as DeepSeek-R1 and Qwen 3. This constraint highlights a central challenge of deploying LLMs on embedded platforms: limited on-device resources require highly optimized inference implementations.
We designed a comprehensive test suite with 40+ prompts across five difficulty levels:
  • Beginner (n = 5): Basic concepts, installation, and simple publisher–subscriber topics;
  • Intermediate (n = 6): Services, launch files, custom messages, and parameters;
  • Advanced (n = 6): Actions, lifecycle nodes, navigation, and perception;
  • Expert (n = 6): Performance optimization, hardware integration, and security;
  • Troubleshooting (n = 3): Common issues and debugging.
Each prompt included three to five expected topics for automated coverage analysis.
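To make the automated coverage analysis concrete, the following Python sketch shows one plausible structure for a benchmark prompt and a simple keyword-matching coverage metric; the field names and matching logic are illustrative assumptions, not the exact benchmark implementation.

```python
# Illustrative sketch (not the exact benchmark code): each prompt carries
# its difficulty level, topic category, and the expected topics used for
# the automated coverage check.
from dataclasses import dataclass

@dataclass
class BenchmarkPrompt:
    level: str                  # "beginner", "intermediate", ..., "troubleshooting"
    category: str               # e.g., "services", "navigation", "security"
    prompt: str                 # the question sent to the model
    expected_topics: list[str]  # three to five keywords per prompt

def coverage_score(response: str, expected_topics: list[str]) -> float:
    """Percentage of expected topics mentioned in the model response."""
    text = response.lower()
    hits = sum(topic.lower() in text for topic in expected_topics)
    return 100.0 * hits / len(expected_topics)

# Example drawn from the intermediate level (see Table 6).
example = BenchmarkPrompt(
    level="intermediate",
    category="services",
    prompt="Explain topics vs. services and give a Python service example.",
    expected_topics=["request-response", "srv", "client-server"],
)
```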
We evaluated models on multiple dimensions:
  • Quality Score (0–100): Composite measure based on response length, code inclusion, and topic coverage;
  • Coverage Score (0–100%): Percentage of expected topics mentioned;
  • Performance Metrics: Response time, tokens/second;
  • Reliability: Success rate, error rate, and timeout rate;
  • Code Generation: Frequency and quality of code examples.
A weighted composite score combining these metrics follows the formula described in Equation (1):
$$S = 0.35 \times Q + 0.3 \times C + 0.2 \times S_p + 0.15 \times R,$$
where $S$ is the composite score, $Q$ is the quality of the response, $C$ is the coverage of the subject, $S_p$ is the speed of token generation, and $R$ is the reliability of the model. The weights emphasize quality and coverage, since speed is not our main focus. In Equation (2), each of these metrics is normalized and scaled between 0 and 100 to fit the final formula.
$$Q = \frac{\sum Q_{scores}}{N_{tests}}, \qquad C = \frac{\sum C_{scores}}{N_{tests}}, \qquad S_p = \min\!\left( \frac{\sum S_{p,scores}}{N_{tests} \cdot \max(S_{p,scores})} \times 100,\; 100 \right), \qquad R = \frac{N_{successful\ tests}}{N_{tests}} \times 100$$
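In code, the stage-1 scoring can be sketched as follows; the per-test aggregation mirrors our reconstruction of Equation (2), and all identifiers are our own naming rather than the benchmark's exact variables.

```python
# Sketch of the stage-1 scoring from Equations (1) and (2); one entry per
# benchmark prompt in each list. The exact speed normalization is an
# assumption consistent with the text.
def stage1_composite(quality_scores, coverage_scores, speed_scores,
                     n_successful, n_tests):
    Q = sum(quality_scores) / n_tests            # mean quality, 0-100
    C = sum(coverage_scores) / n_tests           # mean coverage, 0-100
    # Speed normalized against the fastest observed run, capped at 100.
    Sp = min(sum(speed_scores) / (n_tests * max(speed_scores)) * 100, 100)
    R = n_successful / n_tests * 100             # success rate, 0-100
    return 0.35 * Q + 0.30 * C + 0.20 * Sp + 0.15 * R   # Equation (1)
```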
Models were provided with a consistent system prompt establishing them as ROS/ROS2 experts with experience in architecture, navigation, perception, hardware integration, and debugging.
The software robot is implemented as an LLM-based control agent running alongside the ROS2 stack (Figure 1) Koubaa et al. [21]. It exposes a web interface where operators can issue natural-language instructions and inspect execution logs. Internally, the agent maintains a catalog of available topics, services, and actions, and uses a planning layer to map high-level tasks (for example, “inspect zone B and report obstacles”) into concrete ROS2 commands and navigation goals. Execution feedback, diagnostics, and sensor summaries are continuously streamed back to the agent, which interprets them with the LLM and presents concise, human-readable status reports. This architecture keeps the safety-critical navigation and control loops intact while exposing them to operators through conversational ROS2 nodes.
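The interaction cycle can be summarized in the following skeleton. The `llm_complete` client, the JSON reply contract, and the helper names are illustrative assumptions; only the standard `ros2` command-line interface, invoked through `subprocess`, is taken as given.

```python
# Skeleton of the agent's request-execute-summarize loop (illustrative).
import json
import shlex
import subprocess

SYSTEM_PROMPT = (
    "You are a ROS2 expert agent. When robot interaction is required, reply "
    'ONLY with JSON: {"command": "<ros2 CLI command>", "explanation": "..."}.'
)

def llm_complete(system: str, user: str) -> str:
    """Placeholder for the model call (e.g., a local model runner's API)."""
    raise NotImplementedError("wire this to your local LLM endpoint")

def handle_request(user_task: str, catalog: dict) -> str:
    # Expose the discovered topics/services/actions as context for planning.
    context = f"Available interfaces: {json.dumps(catalog)}"
    reply = json.loads(llm_complete(SYSTEM_PROMPT, f"{context}\n{user_task}"))
    # Execute the generated ROS2 CLI command and capture its feedback.
    result = subprocess.run(shlex.split(reply["command"]),
                            capture_output=True, text=True, timeout=30)
    # Hand the raw output back to the LLM for a human-readable status report.
    return llm_complete(SYSTEM_PROMPT,
                        f"Summarize this for an operator:\n{result.stdout}")
```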
Table 3 summarizes the main hardware components used in the experimental setup, together with their role in the proposed mobile robot platform.
The ASTIbot platform (Figure 2) is a complex industrial-grade mobile robot equipped with a PLC-based control box for the conveyor belt, with two digital sensors for material detection and a signal lamp for status indication. It uses four Mecanum wheels driven by independent 24 V DC motors to achieve omnidirectional motion and carries an integrated conveyor module with dedicated sensors for load detection and transfer. The robot also includes a self-charging docking interface for the charging station and additional onboard sensors for safe operation in warehouse and production environments, such as a 360° LiDAR and a 3D depth camera Pavel and Stamatescu [22], all managed and controlled by the NVIDIA Jetson Orin Nano embedded GPU development board.

2.1. Mobile Robot Model

The mobile platform used in this work features four omnidirectional Mecanum wheels, which provide a high degree of maneuverability but also increase the complexity of its kinematic model. Depending on the direction and rotational speed of each wheel, the robot can move in any direction without rotating its chassis around the vertical axis Williams et al. [23]. Example motion patterns are illustrated in Figure 3.
To analyze how wheel motions relate to the commands sent by the controller, we consider both the direct and inverse kinematic models. The controller receives velocity commands of the form $\mathbf{u} = [v_x, v_y, \omega_z]^T$, which are then decomposed into individual wheel speeds for each motor, as summarized in Table 4.

2.2. Direct Kinematics

From Figure 3 it can be seen that the linear velocity along the $Ox$-axis is given by the sum of the wheel velocities, whereas the linear velocity along the $Oy$-axis and the angular velocity require a different treatment. For the linear velocity along $Oy$, we consider the resulting direction of motion and the fact that there are always two pairs of wheels whose rotation directions are opposite. The same principle applies when determining the platform’s angular velocity, but in this case we must also account for the distance between the system’s center of gravity and the reference wheel. All of these relationships can be expressed compactly by Equation (3). For simplicity, we assume that the wheel radius is equal to one.
$$\mathbf{u} = \begin{bmatrix} v_x \\ v_y \\ \omega_z \end{bmatrix} = \begin{bmatrix} 1 & 1 & 1 & 1 \\ -1 & 1 & 1 & -1 \\ -\frac{1}{l_x + l_y} & \frac{1}{l_x + l_y} & -\frac{1}{l_x + l_y} & \frac{1}{l_x + l_y} \end{bmatrix} \begin{bmatrix} w_1 \\ w_2 \\ w_3 \\ w_4 \end{bmatrix} = T \begin{bmatrix} w_1 \\ w_2 \\ w_3 \\ w_4 \end{bmatrix}$$
where $l_x$ is the distance from the robot’s center of rotation to the wheel along the x-axis (half the robot’s length), and $l_y$ is the distance from the robot’s center of rotation to the wheel along the y-axis (half the robot’s width).

2.3. Inverse Kinematics

Starting from Equation (3), the inverse kinematics can be derived by inverting the transformation matrix $T$. Since $T$ is not square, we instead employ its pseudo-inverse $T^{+}$, defined to satisfy the condition in Equation (4).
$$T\,T^{+} = T\,T^{T} \left( T\,T^{T} \right)^{-1} = I_3$$
Using this definition, the resulting pseudo-inverse can be written as in Equation (5).
$$T^{+} = \frac{1}{4} \begin{bmatrix} 1 & -1 & -(l_x + l_y) \\ 1 & 1 & (l_x + l_y) \\ 1 & 1 & -(l_x + l_y) \\ 1 & -1 & (l_x + l_y) \end{bmatrix}$$
Consequently, the wheel angular velocities are obtained from the relationship given in Equation (6).
$$\begin{bmatrix} w_1 \\ w_2 \\ w_3 \\ w_4 \end{bmatrix} = T^{+} \mathbf{u} = \frac{1}{4} \begin{bmatrix} 1 & -1 & -(l_x + l_y) \\ 1 & 1 & (l_x + l_y) \\ 1 & 1 & -(l_x + l_y) \\ 1 & -1 & (l_x + l_y) \end{bmatrix} \begin{bmatrix} v_x \\ v_y \\ \omega_z \end{bmatrix}$$
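The relations in Equations (3)–(6) can be verified numerically. The sketch below assumes illustrative values for $l_x$ and $l_y$, a wheel radius of one (as above), and the sign convention implied by the roller angles in Table 4.

```python
# Numerical check of the Mecanum kinematics (Equations (3)-(6)).
import numpy as np

lx, ly = 0.50, 0.37          # illustrative half-length/half-width [m]
L = lx + ly

# Direct kinematics u = T w (Equation (3)), wheel radius = 1.
T = np.array([
    [ 1.0,   1.0,   1.0,   1.0 ],
    [-1.0,   1.0,   1.0,  -1.0 ],
    [-1/L,   1/L,  -1/L,   1/L ],
])

# Pseudo-inverse T+ = T^T (T T^T)^(-1) (Equations (4)-(5)).
T_plus = T.T @ np.linalg.inv(T @ T.T)

u = np.array([0.3, 0.0, 0.1])   # desired [vx, vy, wz]
w = T_plus @ u                  # wheel speeds w1..w4 (Equation (6))
assert np.allclose(T @ w, u)    # T T+ = I3 recovers the body command
```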

2.4. System Control

For the control problem, we are interested in commanding the motion of the entire robotic system, and therefore focus on inputs of the form $\mathbf{u} = [v_x, v_y, \omega_z]^T$. To control the mobile platform, we require its current position and the orientation of the robot’s front (camera) with respect to a global coordinate frame.
As shown in Figure 4, the robot state can be represented by three variables: the Cartesian coordinates $(x, y)$ in the global xy plane and the platform orientation $\theta$ relative to the x-axis. This convention is also commonly used in the ROS development environment. Consequently, the state vector for the kinematic model is defined as $\mathbf{x} = [x, y, \theta]^T$, and together with the control input $\mathbf{u} = [v, \omega]^T$ (linear and angular velocities), the system dynamics are given by Equation (7).
$$\dot{\mathbf{x}} = \begin{bmatrix} \dot{x} \\ \dot{y} \\ \dot{\theta} \end{bmatrix} = \begin{bmatrix} \cos\theta & 0 \\ \sin\theta & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} v \\ \omega \end{bmatrix}$$
As shown in Figure 5, if the desired destination is represented by the state vector $\mathbf{x}_d = [x_d, y_d, \theta_d]^T$, the objective of the controller is to minimize the tracking error defined in Equation (8).
$$\lim_{t \to \infty} \mathbf{e} = \lim_{t \to \infty} \begin{bmatrix} x_e \\ y_e \\ \theta_e \end{bmatrix} = \lim_{t \to \infty} \begin{bmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x_d - x \\ y_d - y \\ \theta_d - \theta \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$
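A minimal implementation of the error transformation in Equation (8) is given below: the world-frame offset to the goal is rotated into the robot frame, yielding the error vector the controller drives to zero.

```python
# Pose-tracking error in the robot frame (Equation (8)).
import numpy as np

def tracking_error(x, y, theta, xd, yd, thetad):
    """Return [x_e, y_e, theta_e] for state (x, y, theta) and goal (xd, yd, thetad)."""
    R = np.array([
        [ np.cos(theta), np.sin(theta), 0.0],
        [-np.sin(theta), np.cos(theta), 0.0],
        [ 0.0,           0.0,           1.0],
    ])
    return R @ np.array([xd - x, yd - y, thetad - theta])
```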
It was demonstrated by Gao et al. [24] that the tracking-error dynamics converge asymptotically to zero by employing a Lyapunov-based analysis. In their work, a suitable Lyapunov candidate function was constructed and proven to be monotonically decreasing, which in turn guaranteed the stability of the closed-loop system. The ASTIbot platform runs a complete vendor-provided ROS stack that already exposes all low-level base control, safety functions, and monitoring interfaces, including a local web server for manual teleoperation and diagnostics (Figure 6). Rather than replacing this existing infrastructure, our LLM-based control agent connects to the available topics, services, and frames, and operates purely at the command and interpretation layer. Building on the benchmarking results reported in Section 2—where Gemma3 demonstrated the best balance between command-generation quality, coverage, and reliability—this integration represents the intended “finish line” of our evaluation: an industrial AI agent Salazar and Vogel-Heuser [25] that can observe the live robot state, generate valid ROS2 commands, and provide high-level, natural-language status reports to end-users, all on top of an unmodified industrial robot stack.
Table 5 summarizes the robot’s hardware platform, listing the main components (computer model, CPU, GPU, memory, operating system, and role) together with their technical specifications used in our experimental setup.

3. Results

Table 6 and Table 7 show examples of prompts used to test the agent. Our full implementation is openly available under the dedicated GitHub repository [26] for replicable benchmarking and evaluation.
Figure 7 shows the overall performance comparison. SmolLM2 achieved the fastest response time (5.22 s) and token generation speed (177.1 tokens/second), while Gemma3 and Gemma3-qat demonstrated the highest quality scores (84.6 and 85.6) and topic coverages (47.6% and 48.6%).
The results in Figure 8 show that not all models work in our environment: the larger models, DeepSeek-R1 and Qwen3, time out because they cannot be loaded quickly enough to process our requests, whereas all the other models achieve 100% reliability.
Errors appear only when a model is unavailable or not downloaded, or when the connection to the containers is lost; none of these conditions arose in our tests, since all experiments ran in a local environment.
Performance varied significantly across ROS domains (Figure 9). Models generally performed well on basic tasks but showed varied expertise in specialized areas like Launch Files, Lifecycles, and Security. Llama 3.2 excelled in Basic Programming and Launch Files, and Gemma3-qat was the only model to achieve 100% coverage in Debugging, Real-time Systems, and TF.
Figure 10 summarizes the relative strengths of each model across all evaluation dimensions. Gemma3 and Gemma3-qat show the most balanced performance, combining high quality and coverage with competitive speed and reliability. SmolLM2 stands out for its superior speed but lags in coverage, while Llama 3.2 provides strong quality but slightly lower robustness in some specialized ROS domains.
Based on our weighted composite score, Gemma3-qat ranks first (62.7/100), followed by Gemma3 (62.1/100) (Figure 11). The breakdown reveals that the slight difference in scores comes from the quality of the response and the coverage of the topics, which is not surprising, as the quantization-aware-trained (qat) version of the model offers comparable performance and quality with a reduced memory footprint.
Our results (Table 8) reveal three key patterns:
  • Domain-specific performance varies: All models showed strengths in certain areas, indicating specialization opportunities.
  • Quality–speed trade-offs exist: Practitioners can choose models based on whether they prioritize response quality or generation speed.
  • Model size does not guarantee adequacy: Although Llama 3.2 achieves a higher quality score (80.76) than SmolLM2 (62.5), their overall composite scores are comparable—57.36 for Llama 3.2 versus 58.42 for SmolLM2—because SmolLM2 performs better on the other weighted evaluation metrics. If the priority is speed and coverage of basic, operator-level topics, a small model is the best option. The Gemma3 models are only about 20% larger than Llama 3.2, yet their topic coverage is almost double while their response times are nearly identical.
Our evaluation has several limitations:
  • Automated quality scoring may not capture all aspects of expertise;
  • Expected topics are subjective and may miss valid alternative approaches;
  • Testing was conducted in English only;
  • Evaluation focused on text responses, not executable code verification;
  • Results may vary with different system prompts.
Based on these results, we moved to stage 2 of the evaluation, where we built an agent able both to answer general questions and to generate ROS2 commands when needed to interact with the mobile robot, checking the ROS topics and services and interpreting the results. For this, we prepared a dedicated system prompt granting the agent access to the ROS topics and services exposed on our robot and specified an explicit response format compatible with our API. This application illustrated the practical integration of LLMs within a ROS2-based workflow, using the ASTIbot mobile robotic platform as a representative case study.
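A sketch of such a response contract is shown below; the `ROS_COMMAND:` tag and the parsing logic are illustrative stand-ins for the actual format used by our API.

```python
# Illustrative stage-2 response contract and parser (not the exact schema).
import re

RESPONSE_FORMAT = (
    "If the request requires robot interaction, answer exactly as\n"
    "ROS_COMMAND: <ros2 CLI command>\n"
    "Otherwise, answer in plain text."
)

COMMAND_RE = re.compile(r"^ROS_COMMAND:\s*(ros2\s+\S.*)$", re.MULTILINE)

def parse_agent_reply(reply: str):
    """Return ('command', cmd) for tagged replies and ('text', reply) otherwise."""
    match = COMMAND_RE.search(reply)
    if match:
        return "command", match.group(1).strip()
    return "text", reply
```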
Figure 12 presents the command-level performance of the evaluated LLMs when interacting with the ROS2-based mobile robot. Gemma3 and Gemma3-qat achieve the highest command generation and interpretation scores, producing a larger proportion of valid commands and correct interpretations of ROS feedback. SmolLM2 remains competitive in terms of valid-command rate but shows lower overall quality, while Llama 3.2 offers strong but slightly less consistent behavior across the four metrics. These results confirm that models with better general ROS expertise also translate into more reliable closed-loop command execution.
Figure 13 shows that command-generation performance is consistently higher than the interpretation performance across most task categories, particularly for navigation and diagnostics. Gemma3 and Gemma3-qat maintain the most stable command-quality scores, producing valid ROS2 commands even in more complex scenarios such as multi-step navigation goals or combined diagnostic checks. Interpretation scores, in contrast, vary more strongly between models and categories, reflecting differences in how reliably each model summarizes status messages, error codes, and log outputs. Interestingly, from a practical robotics perspective, this imbalance is acceptable, and even preferable: generating correct commands is safety-critical and difficult to post-correct automatically, while interpretation logic (e.g., mapping raw feedback into human-readable summaries or dashboard labels) can be iteratively refined in downstream components over time. In other words, a model with slightly lower interpretation scores but consistently high command-generation quality is more suitable for closed-loop control, since command templates are harder to remediate externally, whereas interpretation layers can be adjusted and specialized as the system evolves.
The radar comparison in Figure 14 highlights the relative strengths and weaknesses of the evaluated models at the command level. Gemma3 and Gemma3-qat form the largest and most regular profiles, confirming their balanced performance across all four dimensions: they combine high command-generation quality with consistently strong interpretation of ROS2 feedback. SmolLM2 exhibits a pronounced peak on the speed-related and valid-command axes but a noticeably smaller radius on quality-oriented metrics, indicating that it is well suited for fast, routine interactions but less reliable for complex, multi-step tasks. Llama 3.2 lies between these extremes, delivering high-quality commands and good interpretation performance, albeit with slightly lower robustness in some diagnostic and configuration scenarios. Overall, the radar plot reinforces the conclusion that models optimized for domain expertise (Gemma3/Gemma3-qat) provide the best trade-off for closed-loop control, whereas smaller models such as SmolLM2 are attractive when resource constraints and response times are dominant concerns.
The final command-level score is the weighted composite defined in Equation (9):
$$S = 0.25 \times C + 0.25 \times I + 0.3 \times BV + 0.2 \times S_p,$$
where $S$ is the composite score, $C$ is the correct command generation, $I$ is the correct interpretation of the results received after calling the ROS command, $BV$ is the validity of both the generated command and the interpreted results, and $S_p$ is a speed score derived from the time needed to complete the tasks. All these metrics are scaled between 0 and 100 in Equation (10):
$$C = \frac{\sum C_{scores}}{N_{tests}}, \qquad I = \frac{\sum I_{scores}}{N_{tests}}, \qquad BV = \frac{N_{both\ valid}}{N_{tests}} \times 100, \qquad S_p = \max\!\left( 0,\; 100 - \frac{T_{total}}{N_{tests} \cdot \max(T_{total})} \times 100 \right)$$
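A corresponding sketch in Python follows; the both-valid term is computed as the count of interactions in which command and interpretation are both valid, matching the description above, and all identifiers are our own naming.

```python
# Sketch of the stage-2 command-level composite (Equations (9)-(10)).
def stage2_composite(cmd_scores, interp_scores, n_both_valid,
                     total_time, max_total_time, n_tests):
    C = sum(cmd_scores) / n_tests              # command-generation quality
    I = sum(interp_scores) / n_tests           # interpretation quality
    BV = n_both_valid / n_tests * 100          # both-valid rate, 0-100
    # Faster total runtimes yield higher speed scores, floored at 0.
    Sp = max(0.0, 100 - total_time / (n_tests * max_total_time) * 100)
    return 0.25 * C + 0.25 * I + 0.30 * BV + 0.20 * Sp   # Equation (9)
```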
Using the composite metric, we obtain the final command-level rankings summarized in Figure 15. Models that score highly on the “Both Valid” component dominate the top positions, confirming that simultaneous correctness of commands and interpretations is more decisive than raw speed or isolated quality metrics. Gemma3 and Llama 3.2 achieve strong scores across all four dimensions, while smaller models like SmolLM2 benefit from the speed weighting but are penalized when either commands or interpretations fail. This ranking underlines that, for closed-loop control of mobile robots, robustness of the entire interaction cycle is more critical than maximizing any single metric in isolation.
Figure 16 illustrates the web application that mediates interaction between human operators, AI control agents, and the simulated TurtleBot3 platform, developed by the Open Source Robotics Foundation (Mountain View, CA, USA). The home dashboard (Figure 16a) provides an overview of active ROS2 topics and services, allowing the operator to quickly verify connectivity and system status. The log view (Figure 16b) records all prompts, generated ROS2 commands, and feedback messages, which we use to analyze the agent’s behavior and identify typical failure modes. The chat interface (Figure 16c) is the main entry point for natural-language task specification, from which the agent derives structured commands for navigation and diagnostics. Finally, the location management page (Figure 16d) stores commonly used waypoints in the simulated environment, enabling repeatable experiments and rapid comparison of different models under identical task sequences.
In addition to the web-based control interface, we evaluated the agent in a full ROS2 navigation setup using the standard TurtleBot3 simulation stack. Figure 17 illustrates the simulated environment and the corresponding navigation view, where the LLM-based control agent issues high-level commands that are translated into concrete goals for the ROS2 navigation stack.
Deploying the agent in the simulation demonstrates that it can successfully execute basic navigation and monitoring tasks in a ROS2 environment (Figure 18 and Figure 19). The agent reports the robot status using simple natural-language descriptions, stores and reuses named locations on the map, and interacts with the ROS2 navigation stack to send goal positions and monitor progress. It also interprets sensor information via the prepared ROS topics and analysis tools, allowing it to answer operator queries about the surrounding environment with structured, quantitative statements (for example, reporting the number, type, and relative position of nearby obstacles).
The same workflow was validated on the real system, as illustrated below. Our LLM-based agent connects to the robot's pre-existing ROS stack, issuing ROS commands and interpreting feedback via natural language, without modifying the underlying control software.
Our experiments confirm that the proposed agent can both monitor and control the real platform. Figure 20 depicts the mapped laboratory environment used for validation, where the robot charging station and the CPS storage station are highlighted in green and red, respectively. The results show that the agent integrates successfully with the existing ROS stack and enables operators to supervise and command the platform through natural language via the web interface. By translating user requests into consistent, human-readable commands, the system increases the level of abstraction available to end-users and improves human–machine interaction.

4. Conclusions

This paper introduced an industrial-level application of LLMs that enables end-users to monitor and control a mobile robotic platform in natural language, building on technologies such as ROS and Docker Model Runner. The study achieved its goals by demonstrating an end-to-end methodology for LLM-based interaction with and control of industrial mobile robot platforms.
This work highlights the growing role of AI assistants in robotics development, where large language models can support design, debugging, and integration of complex systems. To be truly useful in this context, models must be evaluated in realistic, domain-specific settings rather than only on general-purpose benchmarks. For mobile robots and smart industry applications, ROS/ROS2 remains the core middleware Carreira et al. [27], so understanding how well different LLMs handle ROS/ROS2 concepts and workflows is essential.
In this study, we focused on four main questions: (i) which LLMs best understand ROS/ROS2 concepts and typical development tasks; (ii) how model performance changes across different difficulty levels, from simple API usage to multi-node integration; (iii) what quality–speed trade-offs arise when using different models in an interactive development loop; and (iv) which models are reliable enough to be considered for production use. Based on our experiments, we show that Gemma3 is a strong choice in terms of reliability, quality, and coverage of ROS-specific topics, and we discuss how these findings can guide both practitioners choosing AI assistants for ROS/ROS2 projects and researchers designing future LLMs tailored to robotics.
Future work will be focused on embedded deployment and benchmarking of the LLM systems on the mobile robotic platform and the evaluation of increasingly complex tasks and scenarios.

Author Contributions

Conceptualization, M.-D.P. and G.S.; methodology, M.-D.P.; software, M.-D.P.; validation, G.S., M.C. and C.G.A.; formal analysis, C.G.A.; resources, M.C.; data curation, G.S.; writing—original draft preparation, M.-D.P.; writing—review and editing, G.S.; visualization, M.-D.P.; supervision, G.S.; project administration, M.C.; funding acquisition, C.G.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research was partially funded by the “Building Opportunities for SMEs Sustainable Entrepreneurship with Artificial Intelligence in Industry 5.0” (BOOST-AI), project number 2024-1-RO01-KA220-HED-000246238, co-funded by the European Union within the Erasmus+ program.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article; further inquiries can be directed to the corresponding author.

Conflicts of Interest

Authors Mihai-Daniel Pavel and Grigore Stamatescu were employed by the company Asti Automation. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AGV: Automated Guided Vehicle
AMR: Autonomous Mobile Robot
AI: Artificial Intelligence
CNN: Convolutional Neural Network
CPU: Central Processing Unit
GPU: Graphics Processing Unit
LLM: Large Language Model
NN: Neural Network
OS: Operating System
ROS: Robot Operating System

References

  1. Wang, J.; Shi, E.; Hu, H.; Ma, C.; Liu, Y.; Wang, X.; Yao, Y.; Liu, X.; Ge, B.; Zhang, S. Large language models for robotics: Opportunities, challenges, and perspectives. J. Autom. Intell. 2025, 4, 52–64. [Google Scholar] [CrossRef]
  2. Zeng, F.; Gan, W.; Huai, Z.; Sun, L.; Chen, H.; Wang, Y.; Liu, N.; Yu, P.S. Large Language Models for Robotics: A Survey. arXiv 2025, arXiv:2311.07226. [Google Scholar]
  3. Liu, Y.; He, H.; Han, T.; Zhang, X.; Liu, M.; Tian, J.; Zhang, Y.; Wang, J.; Gao, X.; Zhong, T.; et al. Understanding LLMs: A Comprehensive Overview from Training to Inference. arXiv 2024, arXiv:2401.02038. [Google Scholar] [CrossRef]
  4. Okumuş, F.; Dönmez, E.; Kocamaz, A.F. A Cloudware Architecture for Collaboration of Multiple AGVs in Indoor Logistics: Case Study in Fabric Manufacturing Enterprises. Electronics 2020, 9, 2023. [Google Scholar] [CrossRef]
  5. Bernardo, R.; Sousa, J.M.; Gonçalves, P.J. Survey on robotic systems for internal logistics. J. Manuf. Syst. 2022, 65, 339–350. [Google Scholar] [CrossRef]
  6. Rainer, R.K., Jr.; Richey, R.G., Jr.; Chowdhury, S. How Robotics is Shaping Digital Logistics and Supply Chain Management: An Ongoing Call for Research. J. Bus. Logist. 2025, 46, e70005. [Google Scholar] [CrossRef]
  7. Dong, W.; Li, D.; Ji, Y.; Chen, H.; Liu, S.; Ma, Z.; Hao, F.; Ji, Y.; Xing, H.; Zheng, P. Towards a next-generation LLM empowered low-code programming industrial robotic system for human-centric smart manufacturing. J. Manuf. Syst. 2025, 83, 675–686. [Google Scholar] [CrossRef]
  8. Vidyaratne, L.; Lee, X.Y.; Kumar, A.; Watanabe, T.; Farahat, A.; Gupta, C. Generating troubleshooting trees for industrial equipment using large language models (llm). In Proceedings of the 2024 IEEE International Conference on Prognostics and Health Management (ICPHM), Beijing, China, 11–13 October 2024; pp. 116–125. [Google Scholar]
  9. Sherwani, F.; Asad, M.M.; Ibrahim, B.S.K.K. Collaborative robots and industrial revolution 4.0 (ir 4.0). In Proceedings of the 2020 International Conference on Emerging Trends in Smart Technologies (ICETST), Karachi, Pakistan, 26–27 March 2020; pp. 1–5. [Google Scholar]
  10. Agostinelli, S.; Lupia, M.; Marrella, A.; Mecella, M. Reactive synthesis of software robots in RPA from user interface logs. Comput. Ind. 2022, 142, 103721. [Google Scholar] [CrossRef]
  11. Pavel, M.D.; Roșioru, S.; Arghira, N.; Stamatescu, G. Control of open mobile robotic platform using deep reinforcement learning. In International Workshop on Service Orientation in Holonic and Multi-Agent Manufacturing; Springer: Berlin/Heidelberg, Germany, 2022; pp. 368–379. [Google Scholar]
  12. Raj, R.; Kos, A. A comprehensive study of mobile robot: History, developments, applications, and future research perspectives. Appl. Sci. 2022, 12, 6951. [Google Scholar] [CrossRef]
  13. Liu, H.; Zhu, Y.; Kato, K.; Tsukahara, A.; Kondo, I.; Aoyama, T.; Hasegawa, Y. Enhancing the LLM-Based Robot Manipulation Through Human-Robot Collaboration. IEEE Robot. Autom. Lett. 2024, 9, 6904–6911. [Google Scholar] [CrossRef]
  14. Shidaganti, G.; Karthik, K.; Anvith; Kantikar, N.A. Integration of RPA and AI in Industry 4.0. In Confluence of Artificial Intelligence and Robotic Process Automation; Springer: Berlin/Heidelberg, Germany, 2023; pp. 267–288. [Google Scholar]
  15. Pavel, M.D.; Stamatescu, G. Flexible Manufacturing System for Enhanced Industry 4.0 and Industry 5.0 Applications. In Proceedings of the 2024 20th International Conference on Distributed Computing in Smart Systems and the Internet of Things (DCOSS-IoT), Abu Dhabi, United Arab Emirates, 29 April–1 May 2024; pp. 483–490. [Google Scholar] [CrossRef]
  16. Rosioru, S.; Mihai, V.; Neghina, M.; Craciunean, D.; Stamatescu, G. PROSIM in the Cloud: Remote Automation Training Platform with Virtualized Infrastructure. Appl. Sci. 2022, 12, 3038. [Google Scholar] [CrossRef]
  17. Zografos, G.; Moussiades, L. Beyond the Benchmark: A Customizable Platform for Real-Time, Preference-Driven LLM Evaluation. Electronics 2025, 14, 2577. [Google Scholar] [CrossRef]
  18. Allal, L.B.; Lozhkov, A.; Bakouch, E.; Blázquez, G.M.; Penedo, G.; Tunstall, L.; Marafioti, A.; Kydlíček, H.; Lajarín, A.P.; Srivastav, V.; et al. SmolLM2: When Smol Goes Big–Data-Centric Training of a Small Language Model. arXiv 2025, arXiv:2502.02737. [Google Scholar]
  19. Zhao, J.; Jin, Z.; Zeng, P.; Sheng, C.; Wang, T. An Anomaly Detection Method for Oilfield Industrial Control Systems Fine-Tuned Using the Llama3 Model. Appl. Sci. 2024, 14, 9169. [Google Scholar] [CrossRef]
  20. Team, G.; Mesnard, T.; Hardin, C.; Dadashi, R.; Bhupatiraju, S.; Pathak, S.; Sifre, L.; Rivière, M.; Kale, M.S.; Love, J.; et al. Gemma: Open models based on gemini research and technology. arXiv 2024, arXiv:2403.08295. [Google Scholar] [CrossRef]
  21. Koubaa, A.; Ammar, A.; Boulila, W. Next-generation human-robot interaction with ChatGPT and robot operating system. Software Pract. Exp. 2025, 55, 355–382. [Google Scholar] [CrossRef]
  22. Pavel, M.D.; Stamatescu, G. Sustainable Manufacturing Application of Embedded Learning Algorithms for Vision-based Defect Detection under the Industry 5.0 Paradigm. In Proceedings of the 2025 33rd Mediterranean Conference on Control and Automation (MED), Tangier, Morocco, 10–13 June 2025; pp. 185–190. [Google Scholar] [CrossRef]
  23. Williams, R.; Carter, B.; Gallina, P.; Rosati, G. Dynamic model with slip for wheeled omnidirectional robots. IEEE Trans. Robot. Autom. 2002, 18, 285–293. [Google Scholar] [CrossRef]
  24. Gao, X.; Gao, R.; Liang, P.; Zhang, Q.; Deng, R.; Zhu, W. A Hybrid Tracking Control Strategy for Nonholonomic Wheeled Mobile Robot Incorporating Deep Reinforcement Learning Approach. IEEE Access 2021, 9, 15592–15602. [Google Scholar] [CrossRef]
  25. Salazar, L.A.C.; Vogel-Heuser, B. Industrial artificial intelligence: A predictive agent concept for industry 4.0. In Proceedings of the 2022 IEEE 20th International Conference on Industrial Informatics (INDIN), Perth, Australia, 25–28 July 2022; pp. 27–32. [Google Scholar]
  26. Pavel, M.D. Model Runner Web App Prompts. Version 1.0.0. 2025. Available online: https://github.com/mihaidanielPavel/ (accessed on 12 January 2026).
  27. Carreira, R.; Costa, N.; Ramos, J.; Frazão, L.; Pereira, A. A ROS2-Based Gateway for Modular Hardware Usage in Heterogeneous Environments. Sensors 2024, 24, 6341. [Google Scholar] [CrossRef] [PubMed]
Figure 1. Architecture of the LLM-enhanced software robot for ROS2-based platforms. The agent connects the user web interface with the underlying ROS2 ecosystem, combining a large language model, a command-planning layer, and ROS2 integration nodes that access topics, services, and logs. High-level natural-language requests are translated into structured ROS2 commands and missions, while feedback from the robot and infrastructure is aggregated and summarized back to the user.
Figure 2. Mobile robot platform (ASTIbot) used for real-life validation experiments: (a) rear view with detailed components of the platform; (b) frontal view.
Figure 3. Examples of omnidirectional motion patterns of the Mecanum-wheeled mobile platform.
Figure 4. Global and robot-fixed coordinate frames used to define the mobile platform pose.
Figure 5. Geometric definition of the pose-tracking error between the robot state and the desired goal.
Figure 6. Mobile robot local web server and ROS node stack used as the integration layer for the AI control agent: (a) Vendor-provided web interface for manual base control and real-time status monitoring on the ASTIbot platform. (b) RViz visualization of the full robot model and sensor configuration.
Figure 7. Overall performance comparison across LLMs. (a) Average response time, (b) generation speed (tokens/s), (c) quality score (0–100), (d) topic coverage percentage.
Figure 8. Model reliability across evaluated LLMs, reporting the proportion of successful responses, errors, and timeouts in the ROS/ROS2 benchmarking environment.
Figure 9. Heatmap showing quality scores across different ROS/ROS2 categories. Darker colors indicate better performance.
Figure 10. Radar chart comparing the evaluated language models across multiple dimensions (quality, topic coverage, speed, reliability, and completeness).
Figure 11. Model rankings based on weighted composite scores. (a) Overall ranking; (b) score breakdown by component.
Figure 12. Overall performance comparison across LLMs. (a) Command generation quality (scores, 0–100), (b) interpretation quality (scores, 0–100), (c) valid-command success rate (%), (d) valid interpretation success rate (%).
Figure 13. Category-wise command-generation performance of the evaluated LLMs. Each subplot reports per-category scores for (a) command generation quality (0–100) and (b) interpretation quality (0–100). Higher command-generation scores indicate a larger proportion of syntactically and semantically valid ROS2 commands, while interpretation scores reflect how accurately models summarize and explain ROS feedback across navigation, perception, diagnostics, and system-configuration tasks (%).
Figure 14. Radar plot summarizing command-level performance of the evaluated LLMs. Each axis corresponds to one evaluation dimension (command-generation mean score, interpretation mean score, both valid scores, speed score, and consistency of the model). Larger, more regular polygons indicate models that achieve a more balanced trade-off between ROS2 command generation and robust interpretation of feedback across all metrics.
Figure 15. Final command-level rankings of the evaluated LLMs based on the composite score that combines command-generation quality (25%), interpretation quality (25%), the rate of interactions where both command and interpretation are valid (30%), and speed (20%). Models with higher composite scores achieve a more favorable trade-off between generating correct ROS2 commands, reliably interpreting feedback, and responding within practical latency bounds.
Figure 16. Web application used to interact with the AI control agents and to monitor the simulated TurtleBot3 mobile robot: (a) Home dashboard summarizing the status of connected ROS2 topics, services, and agents. (b) Log view displaying recent interactions, generated commands, and execution feedback used to test the agent’s abilities. (c) Chat interface where operators issue natural-language instructions that are translated into ROS2 commands. (d) Location management page for defining and storing frequently used waypoints in the simulated environment.
Figure 17. ROS2 TurtleBot3 navigation simulation environment: (a) 3D simulated warehouse-like world in Gazebo with the TurtleBot3 platform. (b) RViz2 view showing the full ROS2 navigation stack, including the robot model, map, local and global paths, and active navigation goals.
Figure 18. ROS2 computation graph of the mobile robot stack: navigation, perception, and low-level control nodes and their communication links.
Figure 19. TF transform tree of coordinate frames used for localization, odometry, and motion control on the mobile platform.
Figure 20. RViz view of the mobile robot in our laboratory.
Table 1. Overview of physical and software (virtual) robots, their characteristics, their applications, and the impact of AI.
Robots | Characteristics | Applications | AI Impact
Physical industrial robots | High-precision, repeatable motions; can handle heavy loads; continuous operation with minimal breaks; high upfront investment but long-term cost savings. | Assembly, welding, painting, material handling in manufacturing Pavel and Stamatescu [15]; logistics and warehousing. | AI-based perception and control improve path planning, quality inspection, and adaptive behavior to variations in the environment.
Mobile service robots | Autonomous navigation in dynamic environments; interaction with humans and objects; equipped with sensors and onboard computing. | Hospital logistics, delivery robots, inspection and maintenance, hospitality and retail. | AI enables robust localization, mapping, and human–robot interaction, extending deployment to unstructured and crowded spaces.
Software process robots (RPA) | Operate in purely digital environments; mimic human interactions with user interfaces and APIs; highly scalable and easily replicable. | Back-office processes (invoice handling, order processing, report generation), integration of legacy IT systems. | ML and NLP allow bots to handle semi-structured data, classify documents, and make context-aware decisions.
Cloud orchestration and infrastructure bots | Monitor and manage distributed computational resources; automatically deploy, scale, and update services. | Fleet management platforms for robots, cloud-based perception or planning services, CI/CD pipelines for robotic software. | AI-driven policies optimize resource allocation, predict load, and enable self-healing infrastructures that support large-scale robotic deployments.
Digital twin and simulation agents | Virtual replicas of physical robots and environments Rosioru et al. [16]; run accelerated simulations and what-if scenarios. | Design and testing of robotic cells, optimization of production lines, energy and throughput analysis, operator training. | AI uses simulation data for policy learning and transfer learning to improve real-world performance.
Data-analysis and monitoring agents | Continuously analyze sensor and log data; detect anomalies and predict failures; provide decision-support dashboards. | Predictive maintenance for robots, quality control, energy optimization, safety monitoring in industrial plants. | Advanced ML models increase accuracy of failure prediction, reduce false alarms, and support proactive interventions that minimize downtime.
Human-facing conversational agents | Natural-language interfaces to robotic systems; support operators and end-users with explanations and task specification. | Voice or chat interfaces for configuring robots, training support, remote assistance and teleoperation. | LLMs enable more intuitive multimodal interaction, automatic documentation, and human–machine collaboration.
Table 3. Hardware platforms and technical specifications.
Computer model | ASUS Vivobook Pro 15 OLED | NVIDIA Jetson Orin Nano 4 GB
CPU | AMD Ryzen 9 5900HX | 6-core Arm Cortex-A78AE v8.2
GPU | NVIDIA GeForce RTX 3050 | 512-core NVIDIA Ampere architecture GPU with 16 tensor cores
Memory | Dedicated GPU memory 4 GB | Shared RAM memory 4 GB
OS | MS Windows 11 Pro | Ubuntu 20.04.6 LTS
Role | Running software LLM agents | Controlling hardware mobile platform
Table 4. Mobile platform wheel geometry and roller angles used in the kinematic model.
Wheel Index | Position | Associated Velocity | Angle of Rollers
1 | Front-left | $w_1$ | −45°
2 | Front-right | $w_2$ | 45°
3 | Rear-left | $w_3$ | 45°
4 | Rear-right | $w_4$ | −45°
Table 5. Hardware components and specifications of the mobile robot platform.
Name | Description
NVIDIA Jetson Orin Nano
  • Arm Cortex-A78AE processor
  • 4 GB RAM LPDDR5
  • MicroSD card slot
  • Gigabit Ethernet (10/100/1000BASE-T)
  • 4 × USB 3.0 Type-A ports
  • 1 × USB-C port
LsLiDAR M10 Scanner
  • TOF (Time-of-Flight) principle
  • 2D scanning of the surrounding 360° environment
  • Measuring frequency: 10 Hz
  • Accuracy: ±3 cm
  • Maximum range: 25 m
ASTRA Depth Camera
  • Distance range: 0.6–8 m
  • RGB image resolution: 1920 × 1080 @ 30 fps
  • Depth image resolution: 640 × 480 @ 30 fps
  • Depth precision: ±3 mm @ 1 m
MD60 100W DC Motor
  • Chassis Wheel Independent Suspension Shock Absorber Damper MD60
  • Reduction ratio: 1:18
  • Motor: DC Brush, rated voltage 24 V, rated power 100 W
  • Rated current: 6 A
  • Rated speed: 175 rpm
  • Rated torque: 55.7 kg·cm
Omnidirectional Wheels (Mecanum)
  • Material: stainless steel + rubber
  • Weight: 700 g (per wheel)
  • Diameter: 152.4 mm
  • Width: 55.5 mm
  • Roller dimensions: 20.4 mm × 41.3 mm
Power Supply | 2 × VRLA Ultracell 12 V battery, 22 Ah (UCG22-12)
Total Weight | 45 kg
Payload | ∼30 kg
Dimensions (W × H × D) | 73.5 cm × 62 cm × 100 cm
Table 6. Representative ROS/ROS2 benchmark prompts by skill level and topic category.
Level | Category | Representative Prompt | Main Topics
Beginner | Basics | Difference between ROS and ROS2? When to use each? | DDS, real-time, lifecycle
Intermediate | Services | Topics vs. services + Python service example | request–response, srv, client–server
Advanced | Actions | Action server with feedback (Python 3) | goal, feedback, result, ActionServer
Expert | Security | Implement SROS2 for production system | DDS security, certificates, encryption
Expert | Migration | Best strategy: large ROS 1 → ROS 2 migration | ros1_bridge, gradual migration
Troubleshooting | Debugging | Nodes cannot see each other—troubleshoot | domain_id, discovery, firewall
Table 7. Natural-language robot queries with expected ROS topics and command examples.
Category | Representative User Query | Key Expected Topics | Typical ROS Command
Position | Where is the robot right now? | pose, position, location, odom, coordinates | ros2 topic echo /robot_pose
Battery | What is the battery level? | battery, percentage, voltage, charging | ros2 topic echo /battery_status
Status | What is the robot status? | status, state, activity, idle, navigating | ros2 topic echo /robot_status
Sensors | Are there any obstacles around the robot? | scan, lidar, obstacles, range, laser | ros2 topic echo /scan
Command Generation | Send the robot to the assembly station | navigate, goto, position, cmd_vel, service | ros2 service call /goto_position
Table 8. Overall model ranking based on the weighted composite benchmark score.
Rank | Model | Quality | Coverage | Speed | Composite
1 | Gemma3-qat | 85.57 | 48.58 | 15.99 | 62.72
2 | Gemma3 | 84.61 | 47.62 | 16.03 | 62.11
3 | SmolLM2 | 62.5 | 14.48 | 86.04 | 58.42
4 | Llama 3.2 | 80.76 | 34.74 | 18.35 | 57.36
