Application of Large Language Models for Detecting Semantic Ambiguity in Industrial Instructions: Impact on Human–Machine Interaction and User Experience in Process Automation Systems of a Metallurgical Plant

Vedeneev, Viktor A.; Kondratiev, Viktor V.; Suslov, Konstantin V.; Kononenko, Roman V.; Govorkov, Aleksey S.; Gladkikh, Vitaliy A.; Karlina, Yulia I.; Karlina, Antonina I.

doi:10.3390/automation7040104

by Viktor A. Vedeneev ¹, Viktor V. Kondratiev ¹,², Konstantin V. Suslov ¹, Roman V. Kononenko ³, Aleksey S. Govorkov ³, Vitaliy A. Gladkikh ⁴, Yulia I. Karlina ⁴ and Antonina I. Karlina ⁴,*

¹ Advanced Engineering School, Cherepovets State University, Lunacharsky Street, 5, 126600 Cherepovets, Russia

² Innovation and Technology Center for Energy and Resource Conservation, A. P. Vinogradov Institute of Geochemistry of the Siberian Branch of the Russian Academy of Sciences, 664033 Irkutsk, Russia

³ Institute of Information Technology and Data Science, Irkutsk National Research Technical University, 664074 Irkutsk, Russia

⁴ Scientific Research and Testing Center “Stroytest”, Moscow State University of Civil Engineering, 129337 Moscow, Russia

* Author to whom correspondence should be addressed.

Automation 2026, 7(4), 104; https://doi.org/10.3390/automation7040104 (registering DOI)

Submission received: 10 May 2026 / Revised: 22 June 2026 / Accepted: 2 July 2026 / Published: 5 July 2026

Abstract

In the context of industrial digitalization and the widespread adoption of process automation systems, Knowledge Management Systems (KMS) play a key role in providing operational personnel with up-to-date instructions and regulations. However, the inherent ambiguity of natural language in technical documentation remains a serious obstacle, leading to incorrect operator actions, process deviations, and increased safety risks. This article investigates the integration of Large Language Models (LLMs) into KMS and its impact on user experience and human–machine interaction in industrial automation environments. We present a method, Semantic Latent Choice Detection, designed to systematically identify interpretation ambiguities in process instructions and operator commands. Unlike existing approaches that require access to the internal model architecture (“white box”) or token-level logits, the proposed method is logit-free and operates with closed commercial LLMs (“black box”) via standard APIs. The method analyzes the semantic similarity of binary text blocks and polysemous terms within the context of a specific technological process. Using a metallurgical production case study, we demonstrate how the system detects hidden semantic collisions (e.g., the difference between “adding ferroalloys into the ladle” and “feeding ferroalloys onto the conveyor”) that are missed by traditional rule-based validation methods. Instead of arbitrarily selecting an interpretation, the system initiates a clarification request to the human operator, thereby reducing cognitive load, preventing erroneous automated decisions, and increasing trust in the KMS. An empirical evaluation conducted in a real-world industrial setting (unit control rooms and dispatch centers) shows a statistically significant reduction in errors related to misinterpretation of process regulations.
The article contributes to the fields of automation engineering, knowledge management, and human-centered automation by proposing a novel method for validating operational instructions in high-risk industrial environments.

Keywords:

automation systems; knowledge management; large language models; semantic ambiguity; human–machine interaction; user experience; metallurgical process control; industrial safety; logit-free analysis; clarification interface

1. Introduction

Modern industrial enterprises, particularly metallurgical plants, operate as complex cyber–physical systems whose efficiency, product quality, and operational safety critically depend on strict adherence to technological instructions and process regulations. Knowledge Management Systems (KMS) aggregate accumulated operational experience and provide it to personnel in real time, maintaining a unified repository of normative and technical documentation, standard operating procedures (SOPs), and emergency protocols [1,2]. However, traditional KMS often encounter the fundamental problem of “tacit knowledge” and the semantic ambiguity of formalized natural-language texts—a challenge that becomes particularly acute when such texts are used as input to automated decision-making systems.

This problem is especially pronounced in metallurgical process automation, where the same lexical construction can carry fundamentally different meanings depending on the operational context. For example, the term “furnace” may refer to a blast furnace, an electric arc furnace, or a ladle furnace; the verb “supply” can denote gas injection, charge material feeding, or a command to move a mechanical actuator. Such ambiguity in process instructions leads to non-compliance or incorrect execution, increasing the risk of accidents, product defects, equipment damage, and occupational injuries [3]. In the context of automated process control systems (ASU TP), ambiguous commands that are fed directly into control loops without human supervision pose an even greater threat.

With the advent of Large Language Models (LLMs), new opportunities have emerged for the semantic analysis of corporate knowledge and natural-language instructions. Recent research demonstrates successful applications of LLMs for information extraction from technical documents, question answering, and operator support [4,5]. However, when integrating LLMs into industrial automation and human-in-the-loop control systems, a fundamental problem arises: modern LLMs are generative by nature and tend to produce the most statistically probable answer rather than the uniquely correct one. When confronted with ambiguity in a command (e.g., “Cool the melting zone”—by water injection or by switching off the heaters?), the LLM may arbitrarily select one interpretation without notifying the operator of the existence of alternatives. Such behavior is unacceptable in high-risk automation environments where deterministic responses, strict safety constraints, and predictable human–machine interaction are mandatory.

To solve this problem, we propose a method called Semantic Latent Choice Detection. The novelty of the method lies in:

Focus on latent ambiguity: The method is specifically designed to identify situations where the text of an instruction or operator command has several semantically valid but mutually exclusive interpretations within the given context;
Logit-free architecture: Unlike methods that analyze the probability distribution of the next token (logits), our approach works with the LLM as a “black box” via an API. This makes it applicable to most modern models that expose a text API, whether proprietary (e.g., GPT-4), open-weight (e.g., Llama), or closed corporate models;
Binary semantic analysis: We focus on analyzing binary text blocks and term homonymy in specific technological chains, which allows us to identify collisions that are not visible during superficial checks.

The aim of this work is to evaluate the impact of the proposed semantic ambiguity detection method on the user experience and effectiveness of human–machine interaction for operational personnel. Specifically, we investigate how reducing semantic uncertainty in process instructions affects command execution speed, error rates, cognitive load, and operator satisfaction with the KMS interface—key metrics for human-centered automation.

The article is organized as follows. Section 2 reviews related work in process automation, knowledge management, ambiguity in technical texts, and human–machine interfaces. Section 3 describes the developed method and system architecture in detail, including formal definitions and algorithms. Section 4 presents the experimental evaluation conducted in a real metallurgical production environment (process control systems and control rooms). Section 5 discusses the results, compares them with alternative approaches, and addresses limitations. Section 6 concludes the article and outlines directions for future research.

2. Literature Review

2.1. Knowledge Management and Process Automation in Industry

The concept of Knowledge Management in industrial automation has been actively developed since the end of the 20th century [6]. The key task is to transform individual employee expertise into formalized, machine-processable knowledge accessible across the enterprise, enabling more intelligent and reliable automation. In metallurgical process automation, where technological processes are complex, time-critical, and potentially dangerous, Standard Operating Procedures (SOPs) and occupational safety instructions acquire particular importance [7]. Research indicates that up to 30% of industrial incidents are associated with incorrect understanding or misinterpretation of written instructions [8]. In automated systems, such misinterpretations can propagate through control loops with severe consequences.

2.2. The Problem of Ambiguity in Technical Texts and Its Impact on Automation

Ambiguity is an inherent property of natural language that poses significant challenges when technical documentation is used in automated or semi-automated decision-making pipelines. In industrial contexts, ambiguity manifests in several forms:

Lexical homonymy: One word has multiple meanings (e.g., “key”—a tool or a solution to a problem);
Syntactic ambiguity: One phrase can be structured differently (e.g., “inspection of the workshop by the foreman”);
Pragmatic ambiguity: Unclear authorial intent.

To reduce ambiguity, methods of Controlled Natural Language and standardized templates have been proposed [9,10,11,12]. However, these approaches are not always effective for dynamic operational contexts, as even formalized texts can be perceived differently by operators depending on the situation, and they do not eliminate ambiguity in legacy documentation.

2.3. Application of LLMs in Industrial Automation and Knowledge Processing

Large Language Models are increasingly being integrated into industrial automation and knowledge processing pipelines. Recent work on industrial automation has proposed LLM-based agents that interpret events from automation systems and support natural-language interaction with production processes [13,14,15,16]. Reviews of industrial LLM applications also emphasize that the main deployment risk is not only model accuracy, but the safe coupling of LLM-generated text with equipment states, control actions, and human supervision [17,18]. In manufacturing and robotics, LLM agents have been investigated as interfaces for task planning and decision support, but these systems require additional safety layers when natural-language commands can be mapped to physical actions [19].

A closely related research stream concerns ambiguity detection in requirements and technical specifications. LLM-based approaches have been used to classify and explain ambiguous requirements, including industrial studies where in-context prompting improves the detection of ambiguous language [20,21]. Another line of work estimates semantic uncertainty in free-form LLM answers by grouping semantically equivalent responses and comparing meaning-level alternatives [13]. These studies motivate the use of LLMs for ambiguity analysis, but they typically focus on software requirements, general question answering, or model uncertainty. In contrast, the present work focuses on industrial instructions in metallurgical process automation, where the relevant output is a clarification request grounded in the current process context, not a rewritten requirement or a model confidence score.

2.4. User Experience and Human–Machine Interaction in Industrial Automation

User experience in industrial automation systems is increasingly recognized as a critical factor in system performance and safety [14]. Key metrics include task completion time, error rate, cognitive load, and operator trust in the system [15]. Journals such as Automation have dedicated special issues to “Enhancing User Experience in Automation and Control Systems,” emphasizing that “user experience, so far widely discussed in pure software engineering, will enter new areas of home and industrial automation systems” [20]. The integration of intelligent agents and LLM-based assistants should improve these metrics, not degrade them through unexpected “hallucinations” or incorrect interpretations. Our work directly addresses this need by introducing a method that explicitly warns about ambiguity rather than silently selecting an interpretation.

3. Method of Semantic Latent Choice Detection

3.1. Overall Architecture

The proposed semantic ambiguity detection system consists of three main modules, designed for seamless integration into existing industrial process automation and KMS infrastructures (Figure 1):

Preprocessing and Contextualization Module: Receives as input the text of an instruction or operator command, as well as the current process context (e.g., unit identifier, shift task, real-time sensor data from SCADA). The context can be represented either as structured numerical data or as a textual description enriched from the enterprise knowledge base;
Semantic Analysis Module (LLM): Uses an external language model (via API) to perform a series of targeted queries aimed at identifying all plausible interpretations of the input text within the given process context;
Decision and Clarification Module: Based on the LLM’s responses, determines whether semantic ambiguity exists and, if so, generates a structured clarification request for the operator. This module also logs detected ambiguity cases for system improvement and safety reporting.

A comprehensive block diagram showing the complete workflow of the proposed method is presented in Figure 1.

Figure 1. Block diagram of the Semantic Latent Choice Detection method. The workflow separates text/context preprocessing, black-box LLM semantic probing, and the final clarification decision made before any command is transferred to the execution system.

Importantly, the analysis module does not require the LLM to select the “most likely” interpretation. Instead, it requests a list of all possible interpretations and evaluates their semantic proximity to objects, actions, and process states present in the enterprise’s knowledge base. This design prevents the LLM from making autonomous, potentially dangerous decisions.

3.2. Formal Definitions of Semantic Ambiguity in the Automation Context

We formalize ambiguity at the term level. Let T denote the text of a command or instruction fragment, W(T) the set of extracted terms and collocations, C the current process context, and K the corporate knowledge base containing equipment identifiers, material classes, procedures, and permissible process actions. For a term w, let I(w) be the set of all its possible interpretations in the subject domain. Each interpretation i in I(w) corresponds to a domain entity Φ(i) in K. We define the set of all entities associated with term w as:

$$E(w) = \bigcup_{i \in I(w)} \Phi(i). \tag{1}$$

A term w is considered ambiguous if E(w) contains at least two different entities belonging to different classes or leading to different technological operations:

$$\mathrm{Amb}(w) = \mathbf{1}\left(|E(w)| \geq 2\right). \tag{2}$$

The entire text T (a command or instruction fragment) is considered ambiguous if there exists at least one term w ∈ W(T) for which Amb(w) = 1:

$$\mathrm{Amb}(T, C, K) = \max_{w \in W(T)} \mathrm{Amb}(w). \tag{3}$$

For example, consider the instruction fragment “Supply gas to the tuyeres.” In this case, the extracted terms include supply, gas, and tuyeres. For the term combination gas to tuyeres, the set of possible interpretations I(w) may include oxygen supply to converter tuyeres and argon supply to ladle-stirring tuyeres. These interpretations are mapped to different entities and technological actions in K. Therefore, E(w) contains more than one feasible entity-action pair, and Amb(w) = 1. Since at least one term or collocation in T satisfies this condition, the whole instruction is marked as ambiguous according to Equation (3).
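Equations (1)–(3) can be illustrated with a minimal Python sketch. The interpretation table, the mapping `PHI`, and all entity identifiers below are hypothetical stand-ins for the LLM output and the knowledge base K; the sketch also simplifies Equation (2) by counting distinct entities only, without checking their classes or resulting operations.

```python
from typing import Dict, Iterable, Set

# Hypothetical toy data: I(w) for two terms, and the mapping Phi from an
# interpretation to an entity identifier in the knowledge base K.
INTERPRETATIONS: Dict[str, Set[str]] = {
    "gas to tuyeres": {
        "oxygen supply to converter tuyeres",
        "argon supply to ladle-stirring tuyeres",
    },
    "supply": {"open the supply line"},
}
PHI: Dict[str, str] = {
    "oxygen supply to converter tuyeres": "converter/oxygen_blow",
    "argon supply to ladle-stirring tuyeres": "ladle/argon_stirring",
    "open the supply line": "valve/open",
}


def entities(term: str) -> Set[str]:
    """Equation (1): E(w) collects Phi(i) over all i in I(w)."""
    return {PHI[i] for i in INTERPRETATIONS.get(term, set())}


def amb_term(term: str) -> int:
    """Equation (2): 1 if E(w) holds at least two distinct entities."""
    return int(len(entities(term)) >= 2)


def amb_text(terms: Iterable[str]) -> int:
    """Equation (3): the text is ambiguous if any of its terms is."""
    return max((amb_term(w) for w in terms), default=0)
```

With this toy data, `amb_text(["supply", "gas to tuyeres"])` evaluates to 1 because the collocation maps to two different entities, which mirrors the tuyeres example above.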

For binary constructs (e.g., conjunctions “or,” “either”), ambiguity can be flagged without analyzing individual terms if both options are permissible in the current context. For this purpose, we introduce an additional function Binary(T) that extracts fragments of the form “X or Y” and evaluates the joint feasibility of both alternatives given the current process state.

3.3. Context-Aware Relevance Filtering

To improve accuracy and reduce false positives, we incorporate a context-aware relevance function

$$\mathrm{rel}(e, C) \in [0, 1], \tag{4}$$

which assesses how well entity e corresponds to the current process context C (e.g., whether the equipment is currently active, part of the shift task, or has active sensors). Then the relevance-filtered set of entities for term w is:

$$E_{\mathrm{rel}}(w) = \{\, e \in E(w) \mid \mathrm{rel}(e, C) \geq \delta \,\}, \tag{5}$$

where δ is a configurable relevance threshold. The context-sensitive ambiguity condition becomes:

$$\mathrm{Amb}_{\mathrm{rel}}(w) = \mathbf{1}\left(|E_{\mathrm{rel}}(w)| \geq 2\right). \tag{6}$$

This filtering reduces false positives in dynamic production environments, where many domain entities are inactive at any given moment and therefore fall below the threshold δ.
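The effect of the threshold δ in Equations (5) and (6) can be shown with a short Python sketch. The relevance scores are invented for illustration; how rel(e, C) is actually computed from SCADA data or the shift task is not specified here and is left as an assumption.

```python
from typing import Dict, Set


def relevant_entities(candidates: Set[str], rel: Dict[str, float], delta: float) -> Set[str]:
    """Equation (5): keep entities whose relevance rel(e, C) is at least delta."""
    return {e for e in candidates if rel.get(e, 0.0) >= delta}


def amb_rel(candidates: Set[str], rel: Dict[str, float], delta: float) -> int:
    """Equation (6): 1 if at least two entities survive the relevance filter."""
    return int(len(relevant_entities(candidates, rel, delta)) >= 2)


# Hypothetical example: three meanings of "furnace" and made-up relevance
# scores for a context in which the blast furnace is not part of the task.
FURNACES = {"blast furnace", "electric arc furnace", "ladle furnace"}
REL_SCORES = {
    "blast furnace": 0.1,
    "electric arc furnace": 0.9,
    "ladle furnace": 0.8,
}
```

In this example, a threshold of 0.5 leaves two candidates and the term stays ambiguous, whereas a threshold of 0.85 leaves a single candidate and the term is treated as unambiguous. The choice of δ therefore trades missed ambiguities against unnecessary clarification requests.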

3.4. The Concept of “Latent Choice”

By latent choice, we mean a situation where the textual input is semantically compatible with multiple distinct domain entities, and these entities belong to different semantic classes or lead to different, potentially incompatible, actions. For example, in the instruction “Add flux,” the term “flux” may refer to lime, fluorspar, or other additives, each with different proportions and addition methods.

To identify a latent choice, we execute the following algorithmic procedure:

Step 1: Extract key terms (nouns, verbs, and their collocations) from the input text.
Step 2: For each extracted term, formulate a query to the LLM: “List all possible meanings and interpretations of the term [X] in the context of metallurgical production (blast furnace shop, steelmaking shop, etc.).”
Step 3: Map the returned list of interpretations to concrete domain objects from the corporate knowledge base (ontologies, equipment directories, material reference data, operational procedures).
Step 4: If a single lexical unit maps to two or more distinct objects from different categories (or with different attribute sets), flag the text as containing a potential latent choice.

A detailed flowchart of this algorithm is presented in Figure 2.
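Steps 1–4 can be sketched in Python as follows. This is a simplified illustration, not the deployed implementation: the term extractor is a naive stop-word filter, the LLM is passed in as a plain callable `ask_llm`, and the knowledge base is a dictionary from interpretation text to a hypothetical (category, object identifier) pair. Step 4 is reduced to the "different categories" condition; the comparison of attribute sets is omitted.

```python
from typing import Callable, Dict, List, Tuple

# A knowledge-base entry is modeled as (category, object identifier).
KbEntry = Tuple[str, str]

STOP_WORDS = {"the", "a", "an", "to", "of", "in", "into", "and"}


def extract_terms(text: str) -> List[str]:
    """Step 1 (placeholder): lowercase tokens with stop words removed."""
    tokens = [t.strip(".,;:!?\"'").lower() for t in text.split()]
    return [t for t in tokens if t and t not in STOP_WORDS]


def detect_latent_choice(
    text: str,
    ask_llm: Callable[[str], List[str]],
    kb: Dict[str, KbEntry],
) -> Dict[str, List[KbEntry]]:
    """Return the terms that map to two or more objects of different categories."""
    flagged: Dict[str, List[KbEntry]] = {}
    for term in extract_terms(text):
        meanings = ask_llm(term)                          # Step 2: query the LLM
        entries = {kb[m] for m in meanings if m in kb}    # Step 3: map to KB objects
        categories = {category for category, _ in entries}
        if len(entries) >= 2 and len(categories) >= 2:    # Step 4: flag latent choice
            flagged[term] = sorted(entries)
    return flagged


# Hypothetical knowledge base and a canned LLM stand-in for "Add flux".
EXAMPLE_KB: Dict[str, KbEntry] = {
    "lime": ("basic_flux", "MAT-LIME"),
    "fluorspar": ("fluidizer", "MAT-CAF2"),
}


def fake_llm(term: str) -> List[str]:
    return ["lime", "fluorspar"] if term == "flux" else []
```

Calling `detect_latent_choice("Add flux.", fake_llm, EXAMPLE_KB)` flags the term "flux" with both knowledge-base entries, which corresponds to the latent choice described for the "Add flux" instruction. Passing the LLM as a callable keeps the sketch independent of any particular vendor API.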

3.5. Prompt-Based Semantic Probing

Unlike approaches that require access to internal next-token probability distributions, our approach relies exclusively on natural-language prompts with explicit instructions. In this paper, the term “logit-free” therefore means that the system does not inspect token logits, token probabilities, hidden states, or model weights. It uses tokens only in the ordinary API sense: the model receives a text prompt and returns a text answer. A representative prompt template is:

“You are an expert in metallurgical process automation. Analyze the following operator command: ‘[command text]’. Considering the current process context: ‘[context description]’. List all possible interpretations of this command that correspond to distinct real objects, actions, or process states in the production facility. If only one interpretation exists, respond with ‘UNAMBIGUOUS’. If multiple interpretations exist, list each with a brief justification of why they differ.”

This prompt-based methodology can be implemented using any LLM that supports instruction following via API, without requiring access to logits or other internal model parameters.
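A minimal Python sketch of how the prompt template could be filled and the text answer parsed is given below. The helper names are hypothetical, and the parser assumes that the model returns one interpretation per line, optionally numbered or bulleted; the template itself does not guarantee this layout, so a production system would need a stricter output contract.

```python
import re
from typing import List

# The template follows the representative prompt quoted above.
PROMPT_TEMPLATE = (
    "You are an expert in metallurgical process automation. "
    "Analyze the following operator command: '{command}'. "
    "Considering the current process context: '{context}'. "
    "List all possible interpretations of this command that correspond to "
    "distinct real objects, actions, or process states in the production "
    "facility. If only one interpretation exists, respond with 'UNAMBIGUOUS'. "
    "If multiple interpretations exist, list each with a brief justification "
    "of why they differ."
)

# Strips a leading bullet ("-", "*") or list number ("1.", "2)") from a line.
_LIST_MARKER = re.compile(r"^\s*(?:[-*]|\d+[.)])\s*")


def build_prompt(command: str, context: str) -> str:
    """Fill the template with the command text and context description."""
    return PROMPT_TEMPLATE.format(command=command, context=context)


def parse_response(answer: str) -> List[str]:
    """Return [] for 'UNAMBIGUOUS', otherwise one entry per non-empty line."""
    text = answer.strip()
    if text.upper().startswith("UNAMBIGUOUS"):
        return []
    items = []
    for line in text.splitlines():
        cleaned = _LIST_MARKER.sub("", line).strip()
        if cleaned:
            items.append(cleaned)
    return items
```

Only text goes in and only text comes out, so the sketch stays within the logit-free constraint: no token probabilities or hidden states are requested from the model.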

3.6. Binary Text Blocks and Mutually Exclusive Alternatives

Binary text blocks are fragments in which the instruction explicitly or implicitly contains alternative execution paths. Typical markers include “or,” “either,” “depending on,” and “if needed.” Such fragments are not treated as ambiguous merely because they contain a conjunction; they are flagged only when more than one alternative is technically permissible in the current context C and the alternatives lead to different actions, equipment objects, or safety consequences.

For example, the instruction “If the temperature exceeds 1600 °C, reduce power or increase coolant flow” contains two possible actions that may be mutually exclusive depending on the metallurgical situation. The system checks K to determine whether both actions are allowed for the current operating mode and then checks C to determine whether both are feasible at the moment. If both alternatives remain feasible after this filtering, the instruction is not executed automatically; instead, the interface asks the operator to select the intended action. This distinguishes binary semantic analysis from ordinary term homonymy: the ambiguity is caused by a branch in the action logic, not by a single polysemous word.
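The binary check can be sketched in Python as follows. The extraction rule is deliberately naive (it looks for a single "X or Y" pair in the last comma-separated clause), and the sets `allowed` and `feasible` are hypothetical stand-ins for the lookups in K and C.

```python
import re
from typing import Optional, Set, Tuple


def binary_alternatives(text: str) -> Optional[Tuple[str, str]]:
    """Extract an 'X or Y' pair from the last clause of the instruction."""
    clause = text.strip().rstrip(".").split(",")[-1].strip()
    parts = re.split(r"\s+or\s+", clause, maxsplit=1, flags=re.IGNORECASE)
    if len(parts) != 2:
        return None
    return parts[0].strip(), parts[1].strip()


def needs_clarification(text: str, allowed: Set[str], feasible: Set[str]) -> bool:
    """Flag the instruction only if both alternatives pass the K and C checks.

    allowed  -- actions permitted for the current operating mode (from K)
    feasible -- actions that can be carried out right now (from C)
    """
    alternatives = binary_alternatives(text)
    if alternatives is None:
        return False
    return all(a in allowed and a in feasible for a in alternatives)
```

For the temperature instruction above, the function returns a clarification flag only when both "reduce power" and "increase coolant flow" appear in both sets. If the context rules out one of them, the branch is resolved without involving the operator.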

3.7. Integration with Knowledge Management and Process Control Systems

The proposed system is embedded into the enterprise’s existing KMS and process automation infrastructure in two operational modes. In the off-line document validation mode, a newly created or edited instruction is automatically analyzed before approval, and the process specialist receives a structured report of detected ambiguous fragments. In the real-time operator interface mode, the same method is applied to commands entered through the HMI, a command line, or a voice interface. If ambiguity is detected, the HMI displays a clarification dialog with the identified interpretation options and waits for the operator’s selection before the command can proceed.

The clarification step implements a human-in-the-loop mechanism rather than an automatic override. When several feasible interpretations remain after contextual filtering, the operator selects the intended interpretation, and this selection is stored together with the command text, process context, and final execution decision. The collected feedback is used to update ambiguity reports, improve the process-command vocabulary, and support later expert review. However, operator feedback does not automatically modify safety rules or permissible-action lists; such updates require validation by technologists or safety specialists.
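One possible shape for the stored clarification record is sketched below in Python. The field names and the JSON log format are assumptions made for illustration; the text above specifies only which items are stored, not how.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import List


def _utc_now() -> str:
    return datetime.now(timezone.utc).isoformat()


@dataclass(frozen=True)
class ClarificationRecord:
    """Hypothetical log entry for one resolved clarification dialog."""

    command_text: str
    process_context: str
    options: List[str]   # interpretations offered to the operator
    selected: str        # interpretation chosen by the operator
    executed: bool       # final execution decision
    timestamp: str = field(default_factory=_utc_now)

    def __post_init__(self) -> None:
        # The operator can only pick one of the offered interpretations.
        if self.selected not in self.options:
            raise ValueError("selected interpretation must be one of the options")


def to_log_line(record: ClarificationRecord) -> str:
    """Serialize the record as one JSON line for later expert review."""
    return json.dumps(asdict(record), ensure_ascii=False)
```

The record is immutable and carries no reference to safety rules or permissible-action lists, which reflects the requirement that operator feedback is collected for review but does not modify those rules automatically.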

This design prevents the execution of ambiguous commands, trains personnel to formulate queries more precisely over time, and builds trust in the automation system.

4. Experimental Evaluation in a Real-World Metallurgical Process Automation Environment

4.1. Description of the Production Site and Experimental Setup

The study was conducted at a metallurgical plant in Russia (name anonymized for confidentiality). The experimental setup involved two operators from control rooms, two technologists responsible for developing and maintaining operational instructions, and the existing KMS based on a corporate documentation portal integrated with an equipment database and SCADA system. This small but operationally realistic configuration was selected to validate whether the proposed assistant can be inserted into an active production workflow without changing the logic of the underlying automation system.

The LLM was accessed through a standard text-generation API using the same prompt template for all corpus items. The model was used in a deterministic or near-deterministic configuration with a low temperature setting to reduce variation between repeated runs. No token-level probabilities, logits, hidden states, or model weights were accessed during the experiment. The API returned only natural-language candidate interpretations, which were subsequently mapped to the corporate knowledge base and evaluated by the decision module.

The experimental environment was a real, operational production facility. Operators were informed that a new assistant interface was being tested but were not told the specific details of the experiment, in order to limit Hawthorne effects. A separate control group of operators continued their normal shift operations without the assistant.

4.2. Experimental Stages

Stage 1 collected a corpus of 400 text items: 260 production instructions, 100 logs of operational commands entered into the ASU TP system over one month, and 40 fragments from occupational safety instructions relevant to process operations.

In addition, an initial process-command vocabulary was constructed from the collected corpus. The vocabulary included equipment names, material names, action verbs, process states, safety-related terms, and common collocations used by operators. Examples of vocabulary entries include ladle, tundish, tuyere, argon flow, coolant, supply, increase, reduce, level, and temperature threshold. This vocabulary was not treated as a complete ontology; rather, it served as a practical lexical layer for mapping LLM-proposed interpretations to entities and actions stored in the corporate knowledge base.

Stage 2 consisted of ambiguity labeling by two plant technologists. They independently annotated all texts for potentially ambiguous passages, and the initial consensus labeling identified 100 fragments (25% of the corpus) as genuinely ambiguous.

Stage 3 applied the proposed Semantic Latent Choice Detection method. The system flagged 92 fragments as containing latent choices. Of these, 82 matched the initial expert consensus labels. The remaining 10 fragments were initially counted as false positives, but after adjudication by both technologists they were also confirmed as ambiguous. To make the evaluation transparent, we report both the strict metrics against the initial consensus and the adjudicated interpretation:

Strict recall against the initial consensus was 82/(82 + 18) = 82%.

Strict precision against the initial consensus was 82/(82 + 10) = 89.1%. After expert adjudication, the 10 additional cases were reclassified as valid ambiguous fragments, so they are discussed as newly discovered ambiguities rather than confirmed false alarms.
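The arithmetic behind these figures can be reproduced with a few lines of Python. The strict values use the counts reported above. The adjudicated values are derived here for illustration and are not reported in the text: they assume that the 10 reclassified fragments become true positives and that the 18 missed fragments remain false negatives.

```python
from typing import Tuple


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Return (precision, recall) from true/false positives and false negatives."""
    return tp / (tp + fp), tp / (tp + fn)


# Strict view: 82 matches, 10 flagged but not in the initial consensus,
# 18 consensus fragments that the system did not flag.
strict_precision, strict_recall = precision_recall(tp=82, fp=10, fn=18)

# Adjudicated view (assumption): all 92 flagged fragments count as correct.
adj_precision, adj_recall = precision_recall(tp=92, fp=0, fn=18)

print(f"strict:      P = {strict_precision:.1%}, R = {strict_recall:.1%}")
print(f"adjudicated: P = {adj_precision:.1%}, R = {adj_recall:.1%}")
```

Under these assumptions, adjudication raises precision to 92/92 and changes recall to 92/110, because the set of ambiguous fragments grows from 100 to 110.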

Stage 4 integrated the method into the live process automation workflow. In a control room of one production unit (the experimental group), a modified HMI version incorporating the intelligent ambiguity detection assistant was deployed for two consecutive weeks. Operations in a second control room continued unchanged (control group).

Stage 5 collected user experience and automation effectiveness metrics for both groups: misinterpretation-related error frequency, command execution time, clarification requests to the dispatcher or technologist, and subjective satisfaction measured by a 5-point Likert scale.

Table 1 presents representative examples of module-level outputs for two cases from the corpus. These examples show how the three-module architecture transforms an input instruction into either a clarification request or an unambiguous decision.

Before annotation, the corpus was normalized by removing exact duplicates, unifying spelling variants of equipment names, and preserving repeated commands only when they appeared in different operational contexts. Approximately 12% of the command-log entries were repeated or near-repeated formulations with minor lexical changes, such as different equipment identifiers or process-state descriptions. These repetitions were retained because, in process automation, the same linguistic command may become ambiguous or unambiguous depending on the active unit, material flow, or safety state.

4.3. Detailed Results

Table 2 summarizes the comparative performance between the experimental group (with the proposed assistant) and the control group (without the assistant).

4.4. Typology of Detected Ambiguities and System Responses

The system successfully identified several different types of semantic ambiguity, as shown in Table 3.

5. Discussion

5.1. Interpretation of Results

The experimental results strongly support our hypothesis that systematic semantic ambiguity detection using LLMs, integrated with appropriate feedback mechanisms, significantly reduces operational errors in process automation environments. The 77% reduction in misinterpretation-related errors represents a substantial improvement in the context of high-risk metallurgical production, where even a single error can lead to equipment damage, product loss, or safety incidents.

5.2. Comparison with Alternative Approaches

The observed 18% increase in average command execution time is a trade-off that operators consistently accepted as favorable given the enhanced safety and reduced cognitive load (reflected in the 74% drop in dispatcher consultations). Post-experiment surveys revealed that operators valued the system’s ability to “make the implicit explicit,” and several noted that the clarification prompts helped them become more aware of ambiguities in their own daily communication (Table 4).

5.3. Study Limitations

Despite the positive results, several limitations must be acknowledged. First, the experiment was conducted in two workshops of one metallurgical plant over a two-week period, so broader validation across multiple enterprises, production units, and longer time horizons is required before claiming full generalizability. Second, the accuracy of ambiguity detection depends on the capabilities of the LLM used. The experiment employed a medium-sized commercial model comparable to GPT-3.5; larger models may improve recall and precision but would increase inference cost. Third, the method was developed for heavy industrial metallurgical operations, and adaptation to chemical, pharmaceutical, or energy production would require customization of the knowledge base, terminology, and contextual integration. Finally, the current implementation uses API calls to an external LLM, which introduces latency. This is acceptable for human-in-the-loop command validation, but deterministic real-time control loops would require on-premise deployment, local caching, or pre-validation of instruction libraries.

5.4. Practical Significance and Implications for Automation Practice

The proposed method offers practical value for industrial enterprises where complex operational instructions are used and the cost of misinterpretation is high. Its main implementation advantage is that it can be integrated with existing KMS and process automation systems through standard APIs without retraining the LLM or exposing internal model parameters. The method is also vendor-agnostic because it requires only instruction-following behavior and text output. In operational practice, the clarification interface has a secondary training effect: operators and technologists gradually learn to formulate commands and instructions more precisely. This contributes to a broader culture of precision in operational communication, with positive spillover effects on documentation quality, shift handovers, and safety briefings even outside the assistant interface.

6. Conclusions and Future Directions

This paper presented a Semantic Latent Choice Detection method for identifying semantic ambiguity in industrial process instructions and operator commands using Large Language Models without requiring access to internal model parameters or logits. The method features a logit-free, API-compatible design, binary semantic analysis of technical texts, and a clarification-oriented decision mechanism that preserves the human operator’s authority in the control loop.

Experimental validation at a real metallurgical plant demonstrated that the proposed assistant can improve the safety and usability of operator interaction without changing the underlying automation logic. During the two-week evaluation, the experimental group recorded 77% fewer misinterpretation-related errors than the control group (2 versus 9) and 74% fewer clarification requests to the dispatcher (1.2 versus 4.7 per shift), and its average HMI satisfaction score was 4.6 out of 5 compared with 3.8. Although the clarification dialog increased average command execution time by 18% (45 s versus 38 s), this delay was limited to cases where ambiguity was detected and was accepted by operators as a reasonable trade-off for safer command execution.

Future work will therefore focus on extending the same human-in-the-loop principle rather than replacing operator judgment. The next development stage will include multimodal processing of voice commands, deeper integration of real-time SCADA and IIoT signals for context-aware filtering, and adaptive clarification generation based on previously resolved cases. Longer longitudinal studies across several production units are also needed to evaluate the effect on safety KPIs such as near-miss frequency and reportable incidents. For critical industrial environments, on-premise deployment of lightweight LLM variants will be investigated to reduce latency and satisfy data sovereignty requirements.

Author Contributions

Conceptualization, V.A.V. and K.V.S.; Methodology, K.V.S.; Software, R.V.K. and Y.I.K.; Validation, R.V.K.; Formal Analysis, K.V.S., A.S.G. and V.A.G.; Investigation, V.V.K. and A.S.G.; Resources, V.V.K. and A.S.G.; Data Curation, V.A.V., R.V.K. and A.S.G.; Writing—Original Draft, V.A.V.; Writing—Review and Editing, V.A.V. and A.I.K.; Visualization, V.V.K., R.V.K., V.A.G. and Y.I.K.; Supervision, V.A.G., Y.I.K. and A.I.K.; Project Administration, Y.I.K. and A.I.K.; Funding Acquisition, A.I.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The original contributions of this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

API	Application Programming Interface
ASU TP	Automated Process Control System (Avtomatizirovannaya Sistema Upravleniya Tekhnologicheskim Protsessom)
CCM	Continuous Casting Machine
HMI	Human–Machine Interface
IIoT	Industrial Internet of Things
KMS	Knowledge Management System(s)
LLM	Large Language Model(s)
MNS	Machine-Building Plant (Metallurgicheskiy Zavod, context-specific)
SCADA	Supervisory Control and Data Acquisition
SOP	Standard Operating Procedure(s)
UX	User Experience

References

1. Nonaka, I.; Takeuchi, H. The Knowledge Creating Company; Oxford University Press: New York, NY, USA, 1995.
2. Davenport, T.H.; Prusak, L. Working Knowledge: How Organizations Manage What They Know; Harvard Business School Press: Boston, MA, USA, 1998.
3. Reason, J. Human Error; Cambridge University Press: Cambridge, UK, 1990.
4. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is All You Need. In Advances in Neural Information Processing Systems 30 (NIPS 2017); Curran Associates, Inc.: Red Hook, NY, USA, 2017; pp. 5998–6008.
5. Bommasani, R.; Hudson, D.A.; Adeli, E.; Altman, R.; Arora, S.; von Arx, S.; Bernstein, M.S.; Bohg, J.; Bosselut, A.; Brunskill, E. On the Opportunities and Risks of Foundation Models. arXiv 2021, arXiv:2108.07258.
6. Wiig, K.M. Knowledge Management: An Introduction and Perspective. J. Knowl. Manag. 1997, 1, 6–14.
7. Hale, A.; Borys, D. Working to rule, or working safely? Part 1: A state of the art review. Saf. Sci. 2013, 55, 207–221.
8. Embrey, D. Preventing human error in process safety. Chem. Eng. Prog. 2000, 96, 37–44.
9. Kuhn, T. A Survey and Classification of Controlled Natural Languages. Comput. Linguist. 2014, 40, 121–170.
10. Carlini, N.; Tramer, F.; Wallace, E.; Jagielski, M.; Herbert-Voss, A.; Lee, K.; Roberts, A.; Brown, T.; Song, D.; Erlingsson, U.; et al. Extracting Knowledge from Industrial Documents using Large Language Models. IEEE Trans. Ind. Inform. 2023, 19, 5123–5132.
11. Ershov, V.A.; Kondratiev, V.V.; Karlina, A.I.; Kolosov, A.D.; Sysoev, I.A. Selection of control system parameters for production of nanostructures concentrates. J. Phys. Conf. Ser. 2018, 1118.
12. Kondratiev, V.V.; Nebogin, S.A.; Sysoev, I.A.; Gorovoy, V.O.; Karlina, A.I. Description of the test stand for developing of technological operation of nano-dispersed dust preliminary coagulation. Int. J. Appl. Eng. Res. 2017, 12, 12809–12813.
13. Kuhn, L.; Gal, Y.; Farquhar, S. Semantic Uncertainty: Linguistic Invariances for Uncertainty Estimation in Natural Language Generation. In Proceedings of the 11th International Conference on Learning Representations (ICLR 2023), Kigali, Rwanda, 1–5 May 2023.
14. Neumann, W.P.; Winkelhaus, S.; Grosse, E.H.; Glock, C.H. Industry 4.0 and the human factor—A systems framework and analysis methodology for successful development. Int. J. Prod. Econ. 2021, 233, 107992.
15. Hancock, P.A.; Billings, D.R.; Schaefer, K.E.; Chen, J.Y.C.; de Visser, E.J.; Parasuraman, R. A meta-analysis of factors affecting trust in human-robot interaction. Hum. Factors 2011, 53, 517–527.
16. Xia, Y.; Jazdi, N.; Zhang, J.; Shah, C.; Weyrich, M. Control Industrial Automation System with Large Language Models. arXiv 2024, arXiv:2409.18009.
17. Xia, Y.; Jazdi, N.; Weyrich, M. Applying Large Language Models for Intelligent Industrial Automation. atp Mag. 2024, 66, 62–71.
18. Bashir, S.; Ferrari, A.; Khan, A.; Strandberg, P.E.; Haider, Z.; Saadatmand, M.; Bohlin, M. Requirements Ambiguity Detection and Explanation with LLMs: An Industrial Study. In Proceedings of the IEEE International Conference on Software Maintenance and Evolution (ICSME), Auckland, New Zealand, 7–12 September 2025; pp. 620–631.
19. Chen, J.; He, J.; Chen, F.; Lv, Z.; Tang, J.; Li, W.; Liu, Z.; Yang, H.H.; Han, G. Towards General Industrial Intelligence: A Survey of Continual Large Models in Industrial IoT. arXiv 2024, arXiv:2409.01207.
20. Ouerghemmi, C.; Ertz, M. Integrating Large Language Models into Digital Manufacturing: A Systematic Review and Research Agenda. Computers 2025, 14, 318.
21. Alhoshan, W.; Zhao, L.; Ferrari, A.; Letsholo, K.J. Enhancing Software Requirements Quality: Ambiguity Detection and Resolution Using Large Language Models. In Computational Science and Computational Intelligence; Springer: Berlin/Heidelberg, Germany, 2025; pp. 340–355.

Figure 2. Flowchart of the latent choice identification algorithm as provided in the original manuscript.

Table 1. Representative module-level outputs and how the system resolves ambiguity in practice.

Input Fragment	Preprocessing/Contextualization Output	LLM Semantic Analysis Output	Decision Module Output
“Supply gas to the tuyeres.”	Key terms: supply, gas, tuyeres. Context: steelmaking unit; two gas subsystems active.	Interpretation 1: oxygen to converter tuyeres. Interpretation 2: argon to ladle stirring tuyeres.	Two feasible entities/actions remain after filtering; ambiguity flag = 1.
“Check the level in the ladle.”	Key terms: level, ladle. Context: steel ladle and tundish are active in shift assignment.	Interpretation 1: steel ladle level. Interpretation 2: hot-metal ladle level. Interpretation 3: tundish level.	Context removes inactive hot-metal ladle, but two feasible objects remain.
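The decision step in Table 1 (filter interpretations by context, then raise the ambiguity flag if more than one feasible reading remains) can be sketched as follows. The `Interpretation` type, the `target` field, and the `decide` function are hypothetical names for this illustration, and matching on an exact equipment label is a simplifying assumption.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass(frozen=True)
class Interpretation:
    description: str  # reading proposed by the LLM semantic analysis
    target: str       # equipment the reading refers to (assumed label)


def decide(candidates: List[Interpretation],
           active_targets: Set[str]) -> Tuple[List[Interpretation], int]:
    """Drop readings whose target is inactive; flag = 1 if several readings survive."""
    feasible = [c for c in candidates if c.target in active_targets]
    ambiguity_flag = 1 if len(feasible) > 1 else 0
    return feasible, ambiguity_flag


# Second row of Table 1: the hot-metal ladle is not in the active shift assignment.
candidates = [
    Interpretation("steel ladle level", "steel ladle"),
    Interpretation("hot-metal ladle level", "hot-metal ladle"),
    Interpretation("tundish level", "tundish"),
]
feasible, flag = decide(candidates, {"steel ladle", "tundish"})
```

In this example the filter removes one of the three readings, two remain, and the flag is raised, which matches the outcome described in the table row. An empty feasible set is not covered by the table and is left to the caller in this sketch.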

Table 2. Comparison of indicators between experimental and control groups.

Indicator	Experimental Group (with Assistant)	Control Group (Without Assistant)	Change
Number of misinterpretation-related errors (2 weeks)	2	9	↓ 77%
Average command execution time, s	45 (including clarification dialogs)	38	↑ 18%
Clarification requests to dispatcher (avg per shift)	1.2	4.7	↓ 74%
Interface satisfaction (average rating 1–5)	4.6	3.8	↑ 0.8

Note on execution time: The 18% increase in the experimental group is attributed to the additional time spent on clarification dialogs when ambiguity was detected. However, this increase was offset by the prevention of errors that, in the control group, often led to longer unplanned troubleshooting activities (not reflected in the average command execution time, which only captures successfully completed operations without incident follow-up). Qualitative interviews with operators confirmed that the clarifications “added only a few seconds” but provided “critical safety assurance.”
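The "Change" column of Table 2 can be recomputed from the two group values as a relative difference with the control group as the baseline. The short check below assumes that definition, which the table does not state explicitly. The function and variable names are illustrative.

```python
def relative_change(experimental: float, control: float) -> float:
    """Percentage change of the experimental value relative to the control value."""
    return (experimental - control) / control * 100.0


# (experimental, control) pairs copied from Table 2.
rows = {
    "errors": (2, 9),
    "execution_time_s": (45, 38),
    "dispatcher_requests": (1.2, 4.7),
}
changes = {name: round(relative_change(e, c), 1) for name, (e, c) in rows.items()}
satisfaction_gain = round(4.6 - 3.8, 1)  # absolute difference on the 1-5 scale
```

Under this definition the values are about -77.8% for errors, +18.4% for execution time, and -74.5% for dispatcher requests. The table's 77%, 18%, and 74% therefore appear to be truncated rather than rounded figures. Rounding to the nearest percent would give 78% for the error reduction.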

Table 3. Types of semantic ambiguity identified by the system, with examples and system responses.

Type of Ambiguity	Example	Detected Interpretations	System Response
Lexical homonymy	“Supply gas to the tuyeres.”	(1) Oxygen to converter tuyeres; (2) Argon to ladle stirring tuyeres	Prompt asking for gas type and tuyere identification
Entity ambiguity	“Check the level in the ladle.”	(1) Steel ladle; (2) Hot metal ladle; (3) Tundish	Selection dialog based on active shift assignment
Action ambiguity	“Increase argon flow.”	(1) Ladle stirring; (2) Stream protection; (3) Tundish mixing	Cross-check with SCADA to identify active processes; if multiple active → warning
Binary construct	“If temperature > 1600 °C, reduce power or increase coolant.”	(1) Reduce power; (2) Increase coolant	Request to specify which action is intended
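The "System Response" column of Table 3 describes clarification prompts rather than automatic choices. A minimal sketch of such a dialog step is given below. The message wording and the helper names `build_clarification` and `resolve` are assumptions for illustration, and a numbered text menu stands in for whatever HMI widget the plant interface actually uses.

```python
from typing import List


def build_clarification(instruction: str, interpretations: List[str]) -> str:
    """Format a numbered question listing every detected interpretation."""
    if len(interpretations) < 2:
        raise ValueError("a clarification needs at least two interpretations")
    lines = [f'The instruction "{instruction}" can be read in {len(interpretations)} ways:']
    for index, text in enumerate(interpretations, start=1):
        lines.append(f"  {index}. {text}")
    lines.append("Which one do you intend? Enter the number.")
    return "\n".join(lines)


def resolve(interpretations: List[str], operator_reply: str) -> str:
    """Map the operator's numeric reply back to the chosen interpretation."""
    choice = int(operator_reply.strip())
    if not 1 <= choice <= len(interpretations):
        raise ValueError("reply is outside the offered range")
    return interpretations[choice - 1]


# Binary construct row of Table 3.
options = ["Reduce power", "Increase coolant"]
message = build_clarification(
    "If temperature > 1600 °C, reduce power or increase coolant.", options
)
chosen = resolve(options, " 2 ")
```

The system never picks an option itself in this sketch. An invalid reply raises an error instead of falling back to a default, which keeps the decision with the operator.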

Table 4. Comparison of the proposed logit-free method with alternative approaches to ambiguity detection.

Approach	Strengths	Limitations	Focus
Traditional expert review	High quality for obvious ambiguities	Limited coverage; fatigue; missed latent ambiguities	Any
Rule-based (regex, ontologies)	Fast, deterministic	High setup effort; poor generalization to novel phrasing	Industrial instructions
Logit-based LLM ambiguity detection [13]	Leverages model’s internal uncertainty estimates	Requires white-box access; impossible for closed APIs	Any text
Proposed logit-free method	Works with any API; no internal access needed; prompt-based	Depends on LLM output quality	Industrial instructions
RAG-based vocabulary retrieval	Retrieves relevant terms and documents from a domain vocabulary or knowledge base	Does not by itself determine whether alternatives are mutually exclusive or require operator clarification	Industrial instructions and knowledge-base search

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

