Article

Conditional Entropy-Based Sequential Decision-Making for AI Adoption in Manufacturing: A Reinforcement Learning Approach

1 Department of Industrial & Management Systems Engineering, Kyung Hee University, Yongin-si 17104, Gyeonggi-do, Republic of Korea
2 Autonomous Manufacturing Research Center, Korea Electronics Technology Institute (KETI), Seongnam-si 13449, Gyeonggi-do, Republic of Korea
* Author to whom correspondence should be addressed.
Systems 2025, 13(9), 830; https://doi.org/10.3390/systems13090830
Submission received: 14 August 2025 / Revised: 15 September 2025 / Accepted: 18 September 2025 / Published: 21 September 2025
(This article belongs to the Special Issue Data-Driven Analysis of Industrial Systems Using AI)

Abstract

Most small- and medium-sized manufacturers face challenges in adopting artificial intelligence (AI) in production systems due to limited domain expertise and the difficulty of making interrelated decisions. This decision-making process can be characterized as sequential decision-making (SDM), in which guidance on the decision order is valuable. This study proposes a data-driven SDM framework to identify an effective order of key decision elements for AI adoption, aiming to rapidly reduce uncertainty at each decision stage. The framework employs a Q-learning-based reinforcement learning approach, using conditional entropy as the reward function to quantify uncertainty. Based on a review of 55 studies applying AI to milling processes, the proposed model identifies the following decision order that minimizes cumulative uncertainty: sensor, data collection interval, data dimension, AI technique, data type, and data collection period. To validate the model, we conduct simulations of 4000 SDM episodes under rule-based constraints using the number of corrected episodes as a performance metric. Simulation results show that the proposed model generates decision orders with no corrections and that knowing the relative order between two elements is more effective than knowing exact positions. The proposed data-driven framework is broadly applicable and can be extended to AI adoption in other manufacturing domains.

1. Introduction

As the manufacturing industry faces increasing pressure to remain competitive amid technological advances, artificial intelligence (AI) is emerging as a key enabler of operational innovation and efficiency [1,2]. In particular, AI directly enhances flexibility and productivity within manufacturing processes, accelerating business growth [1,3]. However, the actual application of AI in manufacturing and its resulting process improvements are progressing more slowly than expected [4]. One possible reason for this delay is that most manufacturers are small- and medium-sized enterprises (SMEs); for example, over 98% of manufacturers in Europe and the U.S. are SMEs [5,6]. Small- and medium-sized manufacturers (SMMs) commonly lack in-house AI expertise [7]. In many SMMs, senior executives, who are not AI experts, are often the ones making decisions regarding AI adoption. This practice not only slows down the decision-making process but also creates barriers to adoption itself [1]. Given that SMMs account for the overwhelming majority of manufacturers, their successful AI-driven transformation is essential for advancing the manufacturing industry as a whole. Supporting SMMs in making more efficient AI adoption decisions is therefore critical to the future of manufacturing. While it might be theoretically possible to provide AI-related human and technical resources to every SMM through governmental funding, such an approach is unrealistic. A more feasible and impactful strategy is to develop a decision support system tailored to the needs of SMM executives, the decision-makers. In particular, such a system can enable decision-makers without AI-related domain knowledge to effectively and efficiently evaluate key aspects of AI adoption and make appropriate decisions. Such a decision support system can also help capital-constrained SMMs adopt AI readily without hiring AI professionals [1,8].
Decision-making for AI adoption in manufacturing can be described as a form of sequential decision-making (SDM). SDM refers to the approach of making decisions step by step, particularly when multiple interrelated elements and their attributes must be determined [9,10]. In this context, elements are defined as distinct decision points that collectively comprise the overall decision-making process. For example, the decision-making process for AI adoption in manufacturing may include various elements such as sensor, data collection interval, and data type. Attributes refer to candidate options that can be selected for each element. For instance, the sensor attribute may include cameras, energy meters, and force sensors. Meanwhile, research studies systematically examining SDM problems regarding the AI-adopting process remain sparse. Although various SDM approaches have been applied in fields such as robotics, games, and control tasks, these studies typically focus on temporal action sequences [11]. However, such approaches are not well-suited for AI adoption in manufacturing, which requires handling logical interdependencies between elements. These logical interdependencies become evident when the selection of one element constrains the feasible choices for others. An effective and efficient way to address SDM problems is to determine an appropriate decision order that reduces uncertainty and narrows down the range of options early [10]. Although uncertainty reduction is one of the promising approaches for addressing SDM problems, it has rarely been explored systematically in prior studies. Thus, there is a clear need for a new approach that incorporates uncertainty reduction into the element ordering in SDM.
The strategy of setting the budget first in a car-purchasing decision represents a typical case of SDM aimed at reducing uncertainty. In SDM, there is typically no predefined order of decision-making elements; thus, the number of available choices is extensive. However, if the budget is specified first, the range of feasible brands, sizes, and models becomes significantly narrower. As such, once the decision order is defined, the number of combinations to be explored (i.e., the uncertainty) is significantly reduced, making the overall SDM process more efficient. This reduction in uncertainty becomes particularly important when the attributes within decision-making elements are interdependent [12]. Considering such interdependencies allows for a more informed decision order that minimizes uncertainty at each stage. Here, a stage denotes a single step in the SDM process during which one element is determined. The following example illustrates how interdependencies among elements affect the decision order in the SDM process for AI adoption in manufacturing. Selecting an energy meter (an attribute) as the sensor (an element) implicitly fixes the data type (another element) as numeric data (an attribute). In contrast, starting with the data type as numeric data may still allow for a wide range of sensor options, including energy, force, position, and pressure meters. Such interdependencies are themselves a form of domain knowledge; thus, decision-makers without domain knowledge can struggle to recognize interdependencies among elements and to determine appropriate decision orders. Therefore, defining an appropriate decision-making order can reduce the overall decision complexity and enhance the efficiency of an SDM support system. Accordingly, this study proposes a systematic, data-driven approach that provides guidelines for SDM in AI adoption for manufacturing, taking into account the interdependencies and uncertainties inherent in SDM.
In particular, this study emphasizes minimizing the total uncertainty that decision-makers face throughout the SDM process by identifying a decision order that maximizes uncertainty reduction at each stage.
Conditional entropy (CE), a fundamental concept in information theory, measures the amount of uncertainty remaining about a random variable Y given that another variable X is known. It is a useful tool for quantifying how much information is gained when one variable is revealed [13]. By applying this concept to SDM, we can evaluate the uncertainty reduction at each decision point. In addition, based on this idea, we develop an algorithm that identifies the efficient order of SDM elements that yields the maximized reduction in CE at each stage. To implement this, we propose a reinforcement learning (RL) model that uses CE as the reward function within the Q-learning framework. RL is well-suited for solving decision-making problems involving multiple interdependent elements [14]. Among various RL algorithms, Q-learning is particularly suitable due to its broad applicability [15]. However, the integration of CE and Q-learning for solving SDM problems has been largely unexplored, highlighting the need to fill this research gap.
This study aims to develop a data-driven SDM model, built upon an extensive dataset of real AI adoption cases, that helps SMMs determine the efficient order of decision elements for AI adoption in manufacturing. To this end, we investigate how uncertainty can be quantified and minimized through the integration of information-theoretic metrics and RL within the SDM framework. Then, we validate the proposed SDM model through simulation to assess its practical effectiveness, since simulation provides a controlled environment to test the SDM under realistic constraints. Through this approach, this study contributes to structured decision-making in AI adoption by proposing a novel SDM model and offering practical support for overcoming the challenges that SMMs face. Furthermore, the proposed framework is designed to be scalable and policy-relevant, providing a foundation for broader applications of structured SDM across diverse domains.

2. Literature Review

This section reviews various approaches to solving SDM problems. It also defines SDM elements and their corresponding attributes based on a review of 55 studies on AI adoption in manufacturing; these 55 studies also serve as training data for the proposed model. In addition, RL and the Q-learning algorithm are reviewed for later use in developing the SDM model.

2.1. Sequential Decision-Making

SDM problems have been widely studied in various domains such as robotics, finance, and healthcare [11,16]. However, the goal of existing SDM studies is to select a sequence of actions to maximize external task rewards. For this reason, most prior studies focus on the temporal dependency of sequential action decisions, rather than considering the interdependencies or uncertainties among SDM elements.
Chen et al. (2021) addressed an SDM problem in which actions unfold in a temporal sequence, focusing on diverse environments such as games and robotic control [17]. More specifically, the proposed model was trained to select an initial action that leads to a sequence of subsequent actions yielding the highest overall reward. In this setting, earlier actions help predict and shape later actions through temporal dependencies. In contrast, SDM for AI adoption in manufacturing is characterized by logical dependencies because the selection of one decision element constrains the feasible options for others. Such an SDM problem can be solved not with temporal progression, but with how decisions reduce uncertainty in the decision structure.
Lee et al. (2022) addressed an SDM problem involving the selection of appropriate actions in various game environments [18]. Their approach focused on learning temporal patterns from observed trajectories to generate high-performing sequences that maximize cumulative rewards. In this approach, the SDM model predicts each next event by relying on the learned relationships between earlier and later events in the trajectory, emphasizing temporal continuity. However, SDM for AI adoption in manufacturing does not follow a fixed temporal sequence but rather involves selecting interdependent decision elements. In such structured decision spaces, the key challenge lies in determining which decision reduces the overall uncertainty in the SDM process, where logical constraints exist between elements, rather than simply predicting the next step in a temporal sequence.
Janner et al. (2021) solved an SDM problem focused on reproducing high-quality decision sequences through imitation, particularly in tasks such as robotic manipulation and navigation [19]. Their method aimed to generate successful behavior sequences by imitating trajectories from prior examples, using a diffusion-based process that gradually refines random inputs. More specifically, the model learns to produce sequences that resemble expert demonstrations, relying on observed action chains without explicitly modeling decision dependencies. In contrast, SDM in AI adoption for manufacturing requires not imitation but deliberate structuring of decision order, in which the selection of one element logically determines or limits the available attributes for other elements to be decided. Thus, the focus is on managing structural interdependencies and uncertainty rather than on recreating previously successful patterns.
SDM problems have also been examined in autonomous driving, focusing on sequential maneuvers in dynamic traffic environments [20]. In the energy domain, hierarchical RL has been applied to support SDM in regional market trading processes [21]. In agriculture, SDM concepts have been explored in computer vision tasks such as real-time poultry monitoring [22]. Human-in-the-loop learning has also been studied as an SDM setting, with approaches such as RL from feedback and preference-based learning [23]. These studies, however, primarily emphasize temporal decision sequences and do not explicitly address the structural interdependencies among decision elements.
Prior SDM studies have primarily focused on predicting sequences of actions based on past trajectories, treating each action as a time-indexed step toward maximizing a cumulative reward. However, solving the SDM problem of AI adoption in manufacturing needs to focus on the efficient order of structured decision elements, in which each choice constrains the space of subsequent options. In this setting, quantifying uncertainty reduced at each step can be critical to identifying a decision order. Therefore, the methodologies and approaches proposed in existing studies are not suitable for modeling SDM for AI adoption in manufacturing. Instead, information-theoretic methods can be more effectively utilized in this context, as they allow for quantifying the uncertainty.

2.2. Decision-Making Based on Information Theory

Information-theoretic approaches are widely used for analyzing decision-making structures and solving related problems. Information theory, introduced by Claude Shannon in 1948, provides a mathematical framework for quantifying uncertainty and measuring the amount of information to design more efficient decision-making processes [13,24]. In information theory, uncertainty is numerically expressed through the concept of entropy. Entropy quantifies the degree of uncertainty associated with a random variable and is calculated as shown in Equation (1) [13,24]. The entropy H(X) of a discrete random variable X is calculated based on the probability p(x) of each possible value x that X can take. A higher entropy value H(X) indicates a greater level of uncertainty and a more dispersed distribution of information.
H(X) = -\sum_{x \in X} p(x) \log_2 p(x)   (1)
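As a quick illustration (not part of the original paper), the entropy in Equation (1) can be estimated from observed samples; the short Python sketch below does this for categorical values.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy H(X) in bits, estimated from a list of observations."""
    n = len(values)
    return -sum(
        (c / n) * math.log2(c / n) for c in Counter(values).values()
    )

# A uniform two-outcome variable carries 1 bit of uncertainty;
# a constant variable carries none.
print(entropy(["H", "T", "H", "T"]))  # 1.0
```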
In decision-making problems, various entropy-derived information-theoretic metrics—such as mutual information (MI), total correlation (TC), and CE—are commonly used. MI represents the amount of information shared between two random variables. MI between random variables X and Y is calculated based on the joint probability p(x, y) of the simultaneous occurrence of two events, along with the marginal probabilities p(x) and p(y), as shown in Equation (2) [13,24]. When the MI value I(X; Y) is zero, it indicates that X and Y are statistically independent [24]. MI measures how much two random variables are informationally connected—that is, it quantifies the degree of dependence between them.
I(X; Y) = \sum_{x \in X} \sum_{y \in Y} p(x, y) \log_2 \frac{p(x, y)}{p(x)\, p(y)}   (2)
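A sample-based estimate of Equation (2) can be sketched similarly (illustrative only; the variable values below are made up):

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """I(X;Y) in bits, estimated from paired observations of X and Y."""
    n = len(xs)
    px, py = Counter(xs), Counter(ys)
    return sum(
        (c / n) * math.log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in Counter(zip(xs, ys)).items()
    )

# Identical variables share all information (here I = H(X) = 1 bit);
# independent variables share none (I = 0).
print(mutual_information([0, 0, 1, 1], [0, 0, 1, 1]))  # 1.0
print(mutual_information([0, 0, 1, 1], [0, 1, 0, 1]))  # 0.0
```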
Since MI is defined for a pair of variables, other metrics such as TC are used to analyze informational dependencies among three or more variables. TC measures the shared information (or redundancy) among all variables in a given set [24]. For a set of random variables {X_1, ..., X_m}, the TC value C(X_1; ...; X_m) is defined as shown in Equation (3) [24]. In this equation, H(X_1, ..., X_m) represents the joint entropy of all variables, and it is calculated as in Equation (4).
C(X_1; \ldots; X_m) = \sum_{i=1}^{m} H(X_i) - H(X_1, \ldots, X_m)   (3)
H(X_1, \ldots, X_m) = -\sum_{x_1 \in X_1} \cdots \sum_{x_m \in X_m} p(x_1, \ldots, x_m) \log_2 p(x_1, \ldots, x_m)   (4)
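Equations (3) and (4) can likewise be computed from aligned sample columns; the sketch below is illustrative and not taken from the paper.

```python
import math
from collections import Counter

def joint_entropy(*cols):
    """H(X1,...,Xm) in bits over aligned sample columns (Equation (4) style)."""
    n = len(cols[0])
    return -sum(
        (c / n) * math.log2(c / n) for c in Counter(zip(*cols)).values()
    )

def total_correlation(*cols):
    """C(X1;...;Xm) = sum_i H(Xi) - H(X1,...,Xm), as in Equation (3)."""
    return sum(joint_entropy(col) for col in cols) - joint_entropy(*cols)

# Three copies of the same binary variable: each H(Xi) = 1 bit,
# the joint entropy is 1 bit, so TC = 3 - 1 = 2 bits of redundancy.
x = [0, 0, 1, 1]
print(total_correlation(x, x, x))  # 2.0
```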
Since MI and TC measure the amount of redundant information shared among variables, they are commonly used for variable selection (i.e., choosing the most relevant input features) or dependency analysis (i.e., identifying how strongly variables are related to each other). Qian and Shu proposed an MI-based variable selection algorithm aiming to reduce redundancy and improve classification performance through a forward greedy search strategy [25]. Todorov and Setchi proposed an MI-based variable selection method for classification tasks using discrete data [26]. Qiu and Niu proposed a variable selection method based on a TC-based information metric to reduce redundancy among variables, aiming to improve classification performance in high-dimensional data [27]. Li et al. compared MI and TC to quantify the multivariate dependency among neural signals, enabling the analysis of functional connectivity across brain regions [28]. Collectively, previous studies based on MI and TC have primarily focused on quantifying the redundancy or relevance among variables. While MI and TC are well-suited for global variable selection problems, they are not suitable for solving SDM problems.
One promising approach to solving the SDM problem is to minimize the remaining set of options (i.e., uncertainty) at each stage. In such cases, CE can serve as a useful measure for guiding the decision-making process. The CE of a random variable Y given another random variable X is quantified by the expression in Equation (5). CE measures the uncertainty that remains about Y when X is known [13]. A CE value of zero indicates that Y is fully determined by X; that is, knowing X completely eliminates the uncertainty in Y [24]. Conversely, CE reaches its maximum when X and Y are statistically independent [24].
H(Y \mid X) = -\sum_{x \in X,\, y \in Y} p(x, y) \log_2 p(y \mid x)   (5)
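The sensor and data-type example from the Introduction can be reproduced numerically with Equation (5); the samples below are invented for illustration and are not the paper's dataset.

```python
import math
from collections import Counter

def conditional_entropy(ys, xs):
    """H(Y|X) in bits: uncertainty left in Y once X is known (Equation (5))."""
    n = len(xs)
    px = Counter(xs)
    return -sum(
        (c / n) * math.log2(c / px[x])  # c / px[x] estimates p(y|x)
        for (x, y), c in Counter(zip(xs, ys)).items()
    )

# Each sensor implies exactly one data type, so knowing the sensor
# leaves 0 bits of uncertainty about the data type ...
sensors = ["energy_meter", "energy_meter", "camera", "camera"]
dtypes  = ["numeric", "numeric", "image", "image"]
print(conditional_entropy(dtypes, sensors))  # 0 bits

# ... but knowing only that the data type is numeric still leaves
# four equally likely sensors, i.e., 2 bits of remaining uncertainty.
print(conditional_entropy(["energy", "force", "position", "pressure"],
                          ["numeric"] * 4))  # 2.0
```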
Although these characteristics of CE make it well-suited for solving SDM problems, a very limited number of studies have directly applied CE to decision-making problems. Dai et al. proposed a CE-based attribute selection (i.e., removing less important attributes to simplify decision-making) method for incomplete decision systems, aiming to reduce uncertainty in classification [29]. Yang et al. proposed a CE-based entropy model, aiming to quantify uncertainty in fuzzy decision information [30].
Overall, information-theoretic approaches have been widely applied to decision-making problems to quantify uncertainty, identify relevant variables, and analyze informational dependencies. As summarized in Table 1, most prior studies have focused on variable selection or reducing redundancy to improve classification accuracy using metrics such as MI and TC. These metrics have proven to be effective for global variable selection or dependency analysis, in which the overall structure of relationships among variables is of primary concern. Table 2 summarizes the key properties of MI, TC, and CE. To begin with, MI cannot directly handle three or more variables. More critically, MI and TC are fundamentally used to assess the amount of shared or redundant information across multiple variables. However, solving SDM problems that aim to minimize remaining uncertainty at each decision stage requires a different perspective. In this context, the ability to quantify uncertainty under conditional circumstances is critical, as decisions at later stages depend on those made earlier. In such cases, CE serves as a more appropriate metric, as it measures the remaining uncertainty of a target variable given a known condition. Despite this suitability, no known studies have directly applied CE to SDM structures. Therefore, to address this research gap, this study proposes a CE-based SDM model. Specifically, this model minimizes CE at each stage to support more efficient decision-making processes under multi-stage decision structures.

2.3. Sequential Decision-Making Elements and Attributes for AI Adoption in Manufacturing

We review 55 previous studies on AI adoption in manufacturing to identify the key decision-making elements and the corresponding attributes of each element (see Table A1 in Appendix A). In particular, the review focuses on the machining process, as it is one of the most commonly applied, representative processes in manufacturing [31]. Furthermore, to obtain sufficient information for decision-making model development, we select studies that clearly describe the data collection methods, the AI technique applied, and the objectives of AI adoption. According to the elements commonly identified across the 55 reviewed studies, we derive eight decision-making elements: purpose, task, sensor, data type, data collection interval, data collection period, data dimension, and AI technique. In addition, we list the corresponding attributes for each element in Table 3.
Adopting AI in manufacturing inevitably involves both a purpose and a task. For example, AI can be introduced to predict energy consumption, diagnose faults, or optimize production time. Since AI relies on data, implementing it involves key decisions regarding how data will be collected, such as sensor selection, data type, and collection parameters. These data-related decision-making elements include the sensor, data type, data collection interval, data collection period, and data dimension.
Sensors can be selected from various attributes such as cameras, energy meters, and force sensors. The data collected from these sensors can take the data type of images, numerical values, or strings. The data collection interval and data collection period are typically determined at the discretion of the user.
Data dimension can be broadly categorized into one-dimensional (1D) and two-dimensional (2D) types, and each can have several attributes depending on scale and resolution. In this study, we define 1D data as single vector-type data that varies over time or sequence (e.g., sound, text, and vibration), and 2D data as matrix-type data with spatial or visual structures (e.g., images and position grids). The data dimension in this study refers to the raw sensor-level data and does not consider the transformed input structures that can be used later for AI models.
Finally, the AI technique element includes widely used methods, such as clustering, k-nearest neighbor (kNN), neural networks (NNs), and support vector machines (SVMs). Attributes that do not fall into predefined categories are grouped under ‘others’.
In addition, based on the same 55 studies reviewed to identify decision-making elements and their attributes, we extract a total of 217 decision cases for the development of the data-driven SDM model (see Table A1 in Appendix A). These 217 decision cases describe how the attributes of each element were selected in actual AI adoption scenarios. These decision cases are then converted into categorical variables and used to calculate CE among the decision-making elements within the SDM model. Table 4 illustrates examples of the data, encoded as categorical variables, that are utilized in the development of the model.
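To illustrate how categorical decision cases of this kind could feed a CE computation, the sketch below builds a small hypothetical table (the element names follow Table 3, but the cases themselves are invented, not the 217 real cases) and prints the CE of each element given each other element.

```python
import math
from collections import Counter

def conditional_entropy(ys, xs):
    """H(Y|X) in bits from paired categorical observations."""
    n = len(xs)
    px = Counter(xs)
    return -sum(
        (c / n) * math.log2(c / px[x])
        for (x, y), c in Counter(zip(xs, ys)).items()
    )

# Hypothetical decision cases; the actual model uses 217 cases from 55 studies.
cases = [
    {"sensor": "energy_meter", "data_type": "numeric", "ai_technique": "NN"},
    {"sensor": "camera",       "data_type": "image",   "ai_technique": "NN"},
    {"sensor": "force_sensor", "data_type": "numeric", "ai_technique": "SVM"},
    {"sensor": "energy_meter", "data_type": "numeric", "ai_technique": "kNN"},
]
columns = {e: [case[e] for case in cases] for e in cases[0]}

# Pairwise CE matrix: how much uncertainty remains in each element
# once another element has been decided.
for known in columns:
    for target in columns:
        if known != target:
            h = conditional_entropy(columns[target], columns[known])
            print(f"H({target} | {known}) = {h:.2f} bits")
```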

2.4. Reinforcement Learning and Q-Learning for Sequential Decision-Making

Several approaches have been proposed to model situations in which elements are sequentially selected based on entropy reduction or information gain. For example, decision-tree algorithms sequentially partition attributes based on information gain (i.e., entropy reduction). However, decision-tree algorithms generally evaluate local uncertainty reduction at each split and cannot explicitly account for how earlier choices affect the uncertainty of later ones [32,33].
As reviewed in Section 2.3, decision-making on AI adoption in manufacturing involves eight interdependent decision elements, each with three to ten possible attributes. That is, this SDM problem is significantly complex, and each choice influences the others. Solving complex SDM problems can be formulated as a Markov decision process (MDP) [34]. MDP provides a mathematical framework for modeling decision-making in situations where outcomes are partly random and partly under the control of decision-makers [35]. More specifically, an MDP calculates the expected utility of each possible action in a given state [35]. However, theoretical approaches using MDPs often face practical limitations due to the computational cost of constructing a transition probability matrix (TPM) for all possible state-action pairs; this issue is commonly termed the curse of dimensionality [34]. To deal with this challenge, RL offers an alternative approach that does not require explicit modeling of TPM. Instead, RL enables the agent to learn optimal decision policies through interaction with the environment [14].
Several prior studies have applied entropy-based measures within RL frameworks to address SDM problems. For example, Filippi et al. (2010) proposed an RL algorithm using Kullback–Leibler divergence, an entropy-derived metric, mathematically addressing the trade-off between exploration and exploitation [36]. Li et al. (2021) introduced an RL framework for active feature acquisition, aiming to balance predictive accuracy with acquisition costs by evaluating feature redundancy using MI [37]. Similarly, Maliah and Shani (2018) applied RL to model cost-sensitive classification as an MDP, incorporating entropy-based split criteria commonly used in decision trees [38]. While these studies demonstrated the usefulness of entropy-related metrics within RL frameworks, their primary objectives are to improve prediction accuracy or to reduce costs. In addition, the prior approaches did not explicitly quantify logical interdependencies among decision elements or minimize overall uncertainty across sequential choices. Addressing this gap, our study employs CE to explicitly capture interdependencies among decision elements and to reduce cumulative uncertainty in the SDM process.
In RL, the agent continuously interacts with the environment in a feedback loop. At each time step t, the environment provides the agent with a state s_t, representing the current situation. In response to the state, the agent selects an action a_t, and the environment returns a reward r_{t+1}, a scalar signal indicating the immediate outcome of the chosen action. Over time, the agent aims to learn an optimal policy that maximizes the cumulative reward through experience and policy updates [14].
A central concept in RL is the Q-value, which estimates the expected return for an agent taking action a in state s and subsequently following a policy \pi. The Q-value Q^\pi(s, a) is defined based on the Bellman expectation equation, as shown in Equation (6) [14,39].
Q^\pi(s, a) = \mathbb{E}_\pi\left[ \sum_{k=0}^{\infty} \gamma^k r_{t+k+1} \,\middle|\, s_t = s,\ a_t = a \right]   (6)
Here, \gamma is the discount factor, which determines the present value of future rewards, and the expectation \mathbb{E}_\pi is taken over all possible future trajectories generated by policy \pi. This formulation is used to evaluate a given policy. To find the optimal policy \pi^*, the Bellman optimality equation defines Q^*(s, a) as the maximum expected return over all possible next actions, as shown in Equation (7) [14,39].
Q^*(s, a) = \mathbb{E}\left[ r_{t+1} + \gamma \max_{a'} Q^*(s_{t+1}, a') \,\middle|\, s_t = s,\ a_t = a \right]   (7)
In this equation, a' denotes a possible action available in the next state s_{t+1}, and the maximization identifies the optimal action with the highest expected return [14].
Among various RL algorithms, Q-learning is one of the most widely used due to its simplicity and broad applicability [15]. Q-learning iteratively updates the Q-values at each step, allowing the agent to learn to select the action with the highest Q-value in each state [15]. One notable advantage of Q-learning is its ability to directly reflect the objective of decision-making within the reward function. For example, Khriji et al. defined a reward function based on route suitability to determine the optimal path for a mobile robot [40]. Similarly, Chakole et al. formulated a reward function using profits and losses to develop optimal trading strategies in the stock market [41].
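The iterative update that Q-learning performs can be sketched as a single tabular step (a generic textbook form; the state and action labels and the values of alpha and gamma are arbitrary illustrations):

```python
from collections import defaultdict

def q_update(Q, s, a, r, s_next, next_actions, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max((Q[(s_next, a2)] for a2 in next_actions), default=0.0)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
    return Q[(s, a)]

Q = defaultdict(float)
# First visit to ("start", "pick_sensor") with reward 1.0 and a terminal
# next state (no remaining actions): the Q-value moves by alpha * reward.
q_update(Q, "start", "pick_sensor", 1.0, "done", [])
print(Q[("start", "pick_sensor")])  # 0.1
```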
This study aims to develop an SDM model leveraging the characteristics of the Q-learning algorithm. Specifically, at each decision-making stage (state), the policy determines which element to select (action), and the reward function is defined by the change in CE resulting from that selection. By using CE within the reward function, the model learns an efficient order of decisions that minimizes uncertainty throughout the sequence.
Although many prior studies have proposed methodologies for solving SDM problems, these approaches are not directly applicable to modeling SDM for AI adoption in manufacturing due to their primary focus on temporal dependencies. In addition, while many prior studies have extensively applied information-theoretic measures such as MI and TC for decision-making, these metrics are not well-suited for solving the SDM problems that aim at reducing uncertainty and determining efficient decision order. In such problems, CE is more appropriate. Nevertheless, CE has not been directly applied to SDM problems. Likewise, although RL and Q-learning have been widely used in complex decision-making scenarios, their integration with CE to support SDM remains underexplored. These observations reveal three key research gaps: (i) the lack of SDM methodologies tailored to AI adoption in manufacturing, (ii) the limited use of CE for uncertainty reduction in structured decision processes, and (iii) the absence of a combined CE and Q-learning approach for modeling SDM. To address these gaps, this study proposes a CE-based SDM model for AI adoption in manufacturing. The model uses Q-learning, with CE as the reward, to identify the efficient order of decision elements that minimizes the total amount of uncertainty in the entire SDM process. We define eight decision elements and their attributes by reviewing 55 prior studies and construct a dataset of 217 decision cases. We then validate the model through simulations under varying levels of prior knowledge. This study provides a scalable decision support approach, especially useful for SMMs with limited AI expertise. The proposed model also has the potential to support a wide range of SDM problems beyond manufacturing by providing a generalizable framework for structured, uncertainty-aware decision-making.

3. Proposed Sequential Decision-Making Model

To develop an SDM model, we design an RL model based on a Q-learning algorithm, as outlined in Table 5 and Table 6, which summarize the key notations and the overall structure of the model, respectively. Among the eight decision-making elements identified in Section 2.3, purpose and task are considered prerequisites for AI adoption in manufacturing. Without defining these two elements first, the subsequent decision-making process cannot proceed meaningfully. Moreover, determining purpose and task does not depend on domain knowledge. Therefore, we assume that purpose and task are predetermined, and the objective of the SDM model is to determine an efficient order for the remaining six elements.
In the formulations in Table 6, E is the full set of candidate elements e, and S denotes the set of elements that have already been selected. The state s is defined as the set difference E \ S, i.e., the elements yet to be selected. Although this state is defined based on the milling process data used in this study, it generalizes to SDM problems in other processes. The action a refers to selecting one element x ∈ E \ S in the current state. The reward function r(x) measures the reduction in CE across the remaining elements E \ (S ∪ {x}) when the candidate element x is added to the selected set S. The reward is scaled by a penalty term (1 − u_x), where u_x denotes the proportion of unknown attributes associated with element x. In this context, an unknown attribute indicates that the applicable value for a given element is not identifiable and is therefore treated as a form of missing data. Some elements, such as data collection interval and data collection period, include many attributes identified as unknown. Since all attributes are treated as categorical variables in the model, unknown values can be misinterpreted as informative. To address this issue, we incorporate a penalty into the reward function based on the proportion of unknown values. This adjustment prevents the model from overestimating the informational value of elements with unknown attributes. Based on the state, action, and reward defined in Table 6, the Bellman optimality equation for this problem can be written as Equation (8).
Q*(s, a) = r(x) + γ · max_{a′ ∈ E \ (S ∪ {x})} Q*(s′, a′)
In this equation, s′ denotes the next state obtained by adding the selected element x to the set of previously chosen elements (i.e., s′ = E \ (S ∪ {x})). This formulation explicitly incorporates the reward function defined in Table 6 and reflects how the choice of element x influences subsequent states in the SDM process.
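The CE-based reward described above can be sketched in code. The following is a minimal illustration assuming the decision cases are stored as a list of dictionaries mapping each element to its categorical attribute; the function and variable names are ours, not taken from the paper's implementation.

```python
from collections import Counter
from math import log2

def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())

def conditional_entropy(records, target, given):
    """H(target | given): entropy of `target` averaged over the
    attribute combinations of the already-selected elements `given`."""
    if not given:
        return entropy([r[target] for r in records])
    groups = {}
    for r in records:
        groups.setdefault(tuple(r[g] for g in given), []).append(r[target])
    n = len(records)
    return sum(len(v) / n * entropy(v) for v in groups.values())

def reward(records, E, S, x, unknown_frac):
    """CE reduction over the remaining elements when x joins S,
    scaled by the (1 - u_x) penalty for unknown attributes."""
    remaining = [e for e in E if e not in S and e != x]
    before = sum(conditional_entropy(records, e, S) for e in remaining)
    after = sum(conditional_entropy(records, e, S + [x]) for e in remaining)
    return (before - after) * (1.0 - unknown_frac.get(x, 0.0))
```

With two perfectly correlated elements (e.g., camera sensors always producing image data), selecting one element removes all uncertainty about the other, so the unpenalized reward equals the marginal entropy of the remaining element.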
Additionally, the Q-learning model is trained with the hyperparameters shown in Table 7, which are set with reference to previous studies [42,43].
The algorithm operates iteratively, as illustrated in Figure 1. It begins with an initial order list containing the two predefined elements (purpose and task), which form the initial selection set S. The remaining elements in E \ S comprise the candidate pool for the next action. At each decision stage, a Q-table is initialized to store the expected Q-value for each selectable element. The agent explores multiple episodes, balancing the exploration of new actions against the exploitation of previously learned Q-values. For each action x ∈ E \ S, the reward is computed based on CE reduction. After learning (i.e., exploration), the agent selects the element with the highest Q-value as the optimal action a*. This element is then added to the order list, and the state is updated by including it in S. The process repeats until all elements are selected (i.e., E \ S = ∅). The algorithm yields a sequentially optimized order of decision elements that minimizes CE at each stage while accounting for data quality issues with unknown attributes through reward penalization with u_x.
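The loop described above can be sketched as follows. A toy reward function stands in for the CE-based reward, and the hyperparameter values in the signature are illustrative defaults, not those of Table 7.

```python
import random

def learn_order(E, predef, reward_fn, episodes=2000, alpha=0.1,
                gamma=0.9, eps=0.2, seed=0):
    """Q-learning over the order in which elements are selected.
    States are sets of already-selected elements; an action picks one
    remaining element; reward_fn(S, x) scores picking x given S."""
    rng = random.Random(seed)
    Q = {}  # (frozenset(selected), element) -> learned Q-value
    for _ in range(episodes):
        S = list(predef)
        pool = [e for e in E if e not in S]
        while pool:
            key = frozenset(S)
            # epsilon-greedy: explore a random element or exploit max Q
            if rng.random() < eps:
                x = rng.choice(pool)
            else:
                x = max(pool, key=lambda e: Q.get((key, e), 0.0))
            r = reward_fn(S, x)
            nxt = frozenset(S + [x])
            future = max((Q.get((nxt, e), 0.0) for e in pool if e != x),
                         default=0.0)
            old = Q.get((key, x), 0.0)
            Q[(key, x)] = old + alpha * (r + gamma * future - old)
            S.append(x)
            pool.remove(x)
    # greedy rollout: at each stage take the element with the highest Q
    order, S = [], list(predef)
    for _ in range(len(E)):
        key = frozenset(S)
        x = max((e for e in E if e not in S),
                key=lambda e: Q.get((key, e), 0.0))
        order.append(x)
        S.append(x)
    return order
```

For example, with a reward that decays with the stage index, the learned greedy rollout places the highest-reward element first, mirroring how the CE-based reward front-loads the most uncertainty-reducing element.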

4. Results and Discussion

This section presents the order of SDM for AI adoption in manufacturing, as derived from the proposed model, and provides simulation-based validation results to evaluate whether the derived order is indeed efficient. Following the predetermined order of purpose and task, the proposed model in this study identified the decision sequence of the remaining six elements that most effectively reduces uncertainty: (i) sensor, (ii) data collection interval, (iii) data dimension, (iv) AI technique, (v) data type, and (vi) data collection period. This order is derived by selecting the element with the highest Q-value at each decision stage. The Q-values calculated for each element at each decision stage are summarized in Table 8.
To validate whether the model-derived order is an efficient solution to SDM problems, we conduct a simulation in which each trial (hereafter referred to as an ‘episode’) demonstrates a complete sequence of decision-making. Specifically, each episode determines (i) the order of six decision-making elements and (ii) the attribute selected for each element. Once the order and attributes of all six elements are finalized, the episode is considered complete. Additionally, the order and attributes of purpose and task are predefined at the beginning of each episode and are not included in the element ordering and attribute selection.
We define four simulation cases that differ depending on the level of knowledge about the decision-making order of six elements: (a) full order known, (b) relative order known for two elements, (c) exact positions known for two elements, and (d) no order known. These four cases are illustrated in Figure 2.
In Case (a), decisions are made according to the SDM order derived from our CE-based Q-learning model (hereafter, the model-derived order). In contrast, Cases (b)–(d) do not represent additional learned methods or outcomes; they are reference scenarios that partially use the model-derived order or do not use it at all. Case (b) assumes that the relative order between two elements in the model-derived order is known. For instance, information such as ‘sensor must be selected before data type’ or ‘AI technique should be decided before the data collection period’ is assumed to be known. In Case (b), the two elements of the known pairwise order are randomly selected first, and then the remaining four elements are randomly ordered with a uniform distribution. In Case (c), the exact positions of two elements in the model-derived order are known. For example, information such as ‘sensor should be selected first, and data dimension should be placed third in the decision order’ is assumed to be known. Also in Case (c), the two elements of the known positions are randomly selected first, and then the other four elements are randomly ordered with a uniform distribution. In Case (d), the decision-maker is assumed not to know the decision order at all, and thus all elements are randomly ordered with a uniform distribution.
When the order of the six elements is determined, an attribute is selected for the element assigned to each stage. Attribute selection follows the rule-based constraints defined in Table 9 to ensure that the simulation reflects realistic conditions. These constraints are based not on specialized domain knowledge but on intuitive and widely accepted interrelationships among elements, such as the natural links between certain sensors and data types or the typical data-dimension requirements of different AI techniques. More specifically, if a sensor is of a camera type, the resulting data must be in image format, and correspondingly, the data dimension must be 2D. Additionally, except for NN models, most AI techniques cannot effectively capture spatial information in matrix-structured data (i.e., 2D data) unless the data are flattened into 1D form. These constraints are therefore essential to prevent the simulation from generating infeasible or unrealistic decision orders that would not be applicable in real-world SDM applications.
Constraints in Table 9 represent directional dependencies between elements. That is, each constraint consists of a trigger condition—the selection of a specific attribute of an element—and a resulting constraint that restricts the feasible attribute of specific elements. During the simulation, attributes are selected randomly from the set of candidates using a uniform distribution by default. However, if a previously selected attribute triggers a constraint, the current element’s attribute is restricted to the feasible attributes allowed by the constraint. In addition, if a constraint allows multiple feasible attributes, one of them is randomly selected using a uniform distribution. For example, if the attribute selected for the data type at the previous stage is ‘image’, then the sensor type allowable at the current stage is restricted to ‘camera’ or ‘others’ due to Constraint 7 in Table 9.
The constraints are not only applied during the attribute selection process but also used as evaluation criteria to assess the feasibility of each simulation case. To this end, we define and detect two types of constraint-related issues during the simulation: constraint violations and constraint conflicts.
A constraint violation occurs when a constraint is triggered but not satisfied because of the decision order, i.e., when the element carrying the trigger condition (the 'if' part) is decided after the element carrying the restriction (the 'then' part). For example, assume that the sensor is selected as 'camera' at a later stage, but the data type was already set to 'numeric' at an earlier stage. In this case, Constraint 1 is violated. To resolve the violation, the simulation backtracks to the earlier stage and revises the attribute of the already selected element so that the constraint is satisfied (e.g., 'numeric' to 'image').
A constraint conflict occurs when two or more constraints require conditions having no intersection to be satisfied. This happens when earlier decisions lead to contradictory requirements. For instance, assume that the AI technique is selected as ‘tree’, which requires a 1D data dimension (Constraint 8), while the data type is set to ‘image’, which requires a 2D data dimension (Constraint 6). These constraints conflict because no single data dimension can satisfy both constraints simultaneously. In this case, the simulation performs a backtracking from the stage determining data dimension to the stage determining data type (the latter-decided element). Then, the attribute of data type is revised (e.g., ‘image’ to ‘numeric’) to resolve the conflict.
The attribute selection and backtracking process is repeated until no further backtracking is required, ensuring that all episodes are completed successfully. Constraint violation and conflict detection are performed at every stage of each episode. If backtracking occurs at any point within an episode, that episode is labeled as a ‘backtracked episode’. The total number of backtracked episodes is used as a key indicator of the robustness and feasibility of the decision order for each simulation case.
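A violation check of this kind can be sketched as follows. This is our own simplified formulation, reusing the (trigger element, trigger attribute, target element, allowed set) rule shape: it flags any rule whose trigger element was decided after its target element while the target's earlier value is now infeasible, and labels the episode as backtracked if any flag is raised.

```python
def violations(order, chosen, constraints):
    """Return the (trigger, target) pairs whose constraint is violated:
    the trigger element was decided after the target element, and the
    target's earlier attribute lies outside the allowed set."""
    pos = {e: i for i, e in enumerate(order)}
    out = []
    for trig_elem, trig_attr, target_elem, allowed in constraints:
        if (chosen.get(trig_elem) == trig_attr
                and pos.get(trig_elem, -1) > pos.get(target_elem, -1)
                and chosen.get(target_elem) not in allowed):
            out.append((trig_elem, target_elem))
    return out

def backtracked(order, chosen, constraints):
    """An episode is labeled 'backtracked' if any violation occurred."""
    return bool(violations(order, chosen, constraints))
```

Counting `backtracked` episodes over many simulated orders reproduces the feasibility metric used to compare the four cases.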
For each of the four cases, 1000 episodes are simulated. The number of backtracked episodes is zero for Case (a), 60 for Case (b), 84 for Case (c), and 81 for Case (d). Since no backtracking occurred in Case (a), the model-derived order can be regarded as the most efficient decision-making order. Figure 3 visualizes one example episode out of the 1000 simulated for each case. In particular, examples of backtracked episodes for Cases (b), (c), and (d) are presented in Figure 3; elements with predefined order information are underlined for each case.
The Case (a) example shows that no constraint violations or conflicts occurred when decisions are made according to the model-derived order, resulting in only forward decision flow and no backtracking. This result suggests that the model-derived order is appropriately structured so that trigger conditions are always determined before the corresponding constraints are applied. Thus, it prevents infeasible decisions and conflicts among constraints. In contrast, in Case (b), only the information that the AI technique should be decided before the data type is provided. In this example, Constraint 8 conflicts with Constraint 6, since the AI technique selected at Stage 1 is ‘tree’ (i.e., not ‘NN’), which requires the data dimension in Stage 4 to be 1D (Constraint 8). However, at Stage 3, the data type is set to ‘image’, which requires the data dimension to be 2D (Constraint 6). This contradiction leads to a violation of the later-applied Constraint 6. Thus, the process backtracks from Stage 4 to Stage 3 and changes the data type to ‘numeric’. In Case (c), it is assumed that the data collection interval is selected as the second element and the AI technique as the fourth. From the Case (c) example, Constraint 5 is violated at Stage 6; therefore, the decision process rolls back to Stage 1 to revise the attribute of the data type to ‘numeric’. Meanwhile, in Case (d), all of the elements are ordered randomly. The example of Case (d) is similar to that of Case (b); Constraints 6 and 8 conflict since the data type is selected as ‘image’ at Stage 1 and the AI technique is selected as ‘SVM’ at Stage 4. In this episode, since Constraint 6 is applied earlier, the process backtracks from Stage 5 to Stage 4, updating the AI technique to ‘NN’.
We quantify the total uncertainty of all remaining decision elements at each stage using the adjusted CE defined in Equation (9). This formulation is derived from the reward function used in our RL model and modifies the original CE to account for informational validity. Specifically, it applies a penalty term (1 − u_e) that reflects the proportion of unknown attributes in element e.
Adjusted CE = Σ_{e ∈ E \ S} H(e | S) · (1 − u_e)
Table 10 and Figure 4 present the adjusted CE values computed at each stage of the SDM process for the four cases. These values represent the average adjusted CE across 1000 episodes for each case at each stage, thereby capturing the general pattern of uncertainty reduction in each case. Among the cases, Case (a) consistently shows the lowest adjusted CE at every stage, followed by Case (b). This indicates that Case (a) reduces uncertainty most rapidly and efficiently across the sequence. In contrast, Cases (c) and (d) exhibit higher CE values at the early and middle stages, indicating that decisions at several stages failed to consider interdependencies comprehensively. These results align with the number of backtracked episodes; Case (a) shows no backtracked episode, and Case (b) shows fewer backtracked episodes compared to Case (c) and Case (d). In addition, to quantify the overall level of uncertainty present throughout the SDM process, the area under the curve (AUC) of adjusted CE is calculated for each case. A smaller AUC indicates a more efficient order in minimizing cumulative uncertainty. Consistent with the stage-wise CE results, Case (a) shows the smallest AUC (6.367), followed by Case (b) (9.291), while Cases (c) and (d) show substantially higher AUC values (11.072 and 10.741, respectively).
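The stage-wise comparison can be reproduced with a simple trapezoidal rule over the per-stage adjusted-CE values (a sketch; the paper does not state its exact integration scheme, and stages are assumed unit-spaced on the x-axis), together with an episode-averaged curve.

```python
def mean_curve(episode_curves):
    """Average adjusted-CE value at each stage across episodes."""
    n = len(episode_curves)
    return [sum(curve[i] for curve in episode_curves) / n
            for i in range(len(episode_curves[0]))]

def auc(stage_values):
    """Trapezoidal area under a stage-wise curve with unit spacing."""
    return sum((a + b) / 2.0 for a, b in zip(stage_values, stage_values[1:]))
```

A smaller area corresponds to a decision order that drives cumulative uncertainty down faster, as in the Case (a) versus Case (d) comparison.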
From the case studies provided above, one of the key findings is that an efficient SDM order is an order that systematically reduces interdependency-based uncertainty among decision elements at each stage. Reducing such uncertainty not only facilitates more informed decisions but also minimizes the need for decision revision, leading to a more stable and efficient decision-making process. The reason why a decision order that quickly reduces uncertainty leads to fewer instances of backtracking can be understood by considering the interdependencies among decision-making elements. In SDM settings, selecting one element narrows down the feasible options for the remaining elements. CE quantifies how much uncertainty remains about an element given what has already been selected. When an element with high dependency on others is chosen early (i.e., one that results in a large CE reduction), the remaining space of valid decisions becomes more constrained and easier to scrutinize. This naturally decreases the probability of selecting infeasible attributes at later stages, thereby reducing the possibility of violating constraints. In contrast, if uncertainty remains high at the early stages (i.e., CE is not sufficiently reduced), the SDM process may proceed in a way that only superficially reflects the logical interdependencies among elements, leading to backtracking and revisions. Therefore, minimizing CE early can prevent backtracking and contribute to an efficient decision flow.
Another key finding from the simulation-based validation results is the critical importance of understanding relative order relationships between decision-making elements. In the AI adoption process for manufacturing, the selection of one element often constrains or directly influences the feasible attributes of subsequent elements. For example, once a specific sensor type is chosen, it naturally restricts the collectible data types and data dimensions. Therefore, when decision-makers know about the relative order between elements (e.g., sensor before data type), this knowledge is useful to avoid infeasible decisions and reduce backtracking. This explains why Case (b), which involved relative pairwise order knowledge, showed fewer backtracked episodes than Case (c), which involved only fixed position information. The findings suggest that establishing relational knowledge among elements is more crucial than knowing the absolute position of a few individual elements when managing complex SDM tasks.
In addition, the simulation results imply that partial knowledge of the decision order does not always lead to better outcomes. Notably, although all six elements are ordered randomly in Case (d), backtracking occurs less often than in Case (c). This suggests that incomplete or imprecise guidance may mislead decision-makers into inefficient decisions. Especially in SMEs, decisions are often made on the basis of fragmented knowledge or intuition due to limited expertise. Accordingly, structured decision support, such as the model proposed in this study, can offer more reliable guidance than intuition-driven approaches, particularly in environments with complex interdependencies among decision elements.
CE proves to be an effective metric for identifying dependencies between decision-making elements in SDM problems. CE directly quantifies the remaining uncertainties after selections, allowing the model to prioritize decisions that most significantly reduce overall uncertainty. Therefore, CE is highly useful for understanding and modeling the interdependencies among decision elements, which is critical for constructing a coherent and feasible decision order.
The RL model designed in this study, based on a Q-learning algorithm, effectively discovers an efficient order of decision-making elements that minimizes CE at each stage. By exploring different selection paths and maximizing the reduction in CE, the RL agent successfully captures the underlying dependencies among elements without requiring explicit prior knowledge. In addition, by applying a penalty to elements with a high proportion of unknown attributes, the model prevents overestimation of uncertain information and makes more reliable and informed decisions. This approach enables the RL model to optimize the decision order by maximizing information gain. In addition, by intelligently handling data quality issues with unknown attributes, this approach increases the practical robustness of the decision-making process.

5. Conclusions

The adoption of AI in manufacturing is rapidly accelerating. However, many SMMs lack the domain knowledge necessary to make structured decisions regarding AI adoption. To address this challenge, we propose a novel, data-driven SDM model that supports SMMs in determining an efficient order of decision elements for AI adoption. The model systematically identifies the order that minimizes uncertainty at each decision stage by using CE as a reward function in RL. In addition, the model addresses data quality issues through the application of a Q-learning-based approach.
The development of the model begins with the identification of major decision-making elements and their corresponding attributes for AI adoption in manufacturing, based on a review of 55 prior studies. Then, a dataset of 217 decision cases is constructed to support model training (see Table A1 in Appendix A). Additionally, the model is validated through simulations across different levels of decision-related knowledge. Simulation results demonstrate that following the model-derived order eliminates backtracking during the decision process. Moreover, the model-derived order reduces uncertainty most rapidly and has the lowest cumulative uncertainty across all stages. These results indicate that the proposed model can be effective in generating a practical and feasible decision order. Furthermore, the results highlight that understanding relative pairwise order between elements is more informative and helpful than knowing the exact positions of two elements within the decision-making order. These findings suggest that the proposed model not only optimizes decision flow but also provides a scalable and systematic framework for guiding AI adoption in manufacturing sites. Although this study focuses on data from milling processes, the decision elements and attributes used in our training data can also be commonly applied across other manufacturing processes. Importantly, the proposed SDM model is domain-agnostic and can be applied to a wide range of SDM problems involving interdependent decision elements. Thus, this research contributes not only to manufacturing applications but also to the broader field of decision support by providing a scalable and generalizable methodology for SDM. However, to extend the framework to new domains, issues such as data sparsity and integration with existing decision-support systems may require the model and data to be further refined or expanded.
While the simulation-based validation offers conceptual evidence of the efficacy of the proposed model, it cannot fully represent real industrial SDM processes. Future research should therefore empirically validate the model at industrial sites to assess both decision quality and practical effectiveness.

Author Contributions

Conceptualization, G.-h.L. and H.-w.J.; methodology, G.-h.L. and H.-w.J.; software, G.-h.L.; validation, G.-h.L. and H.-w.J.; formal analysis, G.-h.L.; investigation, H.-w.J. and B.S.; resources, B.S.; data curation, G.-h.L.; writing—original draft preparation, G.-h.L.; writing—review and editing, H.-w.J. and B.S.; visualization, G.-h.L.; supervision, H.-w.J.; project administration, H.-w.J.; funding acquisition, H.-w.J. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Korea Institute of Energy Technology Evaluation and Planning (KETEP) and the Ministry of Trade, Industry & Energy (MOTIE) of the Republic of Korea (20224000000260). This research was also partially funded by the BK21 FOUR program of Graduate School, Kyung Hee University (GS-1-JO-NON-20240356).

Data Availability Statement

Dataset available on request from the authors.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Abbreviations

The following abbreviations are used in this manuscript:
AUC: Area under the curve
CE: Conditional entropy
kNN: k-nearest neighbor
MI: Mutual information
NN: Neural networks
RL: Reinforcement learning
SDM: Sequential decision-making
SMEs: Small- and medium-sized enterprises
SMMs: Small- and medium-sized manufacturers
SVM: Support vector machines
TC: Total correlation
TPM: Transition probability matrix

Appendix A

Table A1. Literature on AI adoption in manufacturing.
No. | Reference No. | Reference | Purpose | Task | Sensor | Data Type | Data Collection Interval | Data Collection Period | Data Dimension | AI Technique
11[44]QualityDiagnosisVibrationNumeric<0.01 s<1 day<10 (1D)SVM
22[45]QualityDiagnosisForceNumeric<0.01 sUnknown<10 (1D)SVM
3QualityDiagnosisForceNumeric<0.01 sUnknown<10 (1D)Others
4QualityDiagnosisVibrationNumeric<0.01 sUnknown<10 (1D)SVM
5QualityDiagnosisVibrationNumeric<0.01 sUnknown<10 (1D)Others
6EnergyDiagnosisForceNumeric<0.01 sUnknown<10 (1D)SVM
7EnergyDiagnosisForceNumeric<0.01 sUnknown<10 (1D)Others
8EnergyDiagnosisVibrationNumeric<0.01 sUnknown<10 (1D)SVM
9EnergyDiagnosisVibrationNumeric<0.01 sUnknown<10 (1D)Others
103[46]QualityPredictionForceNumeric<0.01 sUnknown<10 (1D)NN
11QualityPredictionSoundNumeric<0.01 sUnknown<10 (1D)NN
12QualityPredictionVibrationNumeric<0.01 sUnknown<10 (1D)NN
13FaultMonitoringForceNumeric<0.01 sUnknown<10 (1D)NN
14FaultMonitoringSoundNumeric<0.01 sUnknown<10 (1D)NN
15FaultMonitoringVibrationNumeric<0.01 sUnknown<10 (1D)NN
16QualityPredictionCameraImageUnknownUnknownUnknownNN
17FaultMonitoringCameraImageUnknownUnknownUnknownNN
184[47]CostControl/OptimizationOthersNumeric<0.01 sUnknown10~100 (1D)SVM
19CostControl/OptimizationOthersNumeric<0.01 sUnknown10~100 (1D)NN
20QualityDiagnosisOthersNumeric<0.01 sUnknown10~100 (1D)SVM
21QualityDiagnosisOthersNumeric<0.01 sUnknown10~100 (1D)NN
22QualityControl/OptimizationOthersNumeric<0.01 sUnknown10~100 (1D)SVM
23QualityControl/OptimizationOthersNumeric<0.01 sUnknown10~100 (1D)NN
24EnergyDiagnosisOthersNumeric<0.01 sUnknown10~100 (1D)SVM
25EnergyDiagnosisOthersNumeric<0.01 sUnknown10~100 (1D)NN
265[48]CostDiagnosisSoundNumeric<0.01 sUnknown<10 (1D)NN
27CostDiagnosisVibrationNumeric<0.01 sUnknown<10 (1D)NN
286[49]CostPredictionCameraImage0.01~1 s1~7days>100 × 100 (2D)NN
29CostControl/OptimizationCameraImage0.01~1 s1~7days>100 × 100 (2D)NN
30QualityPredictionCameraImage0.01~1 s1~7days>100 × 100 (2D)NN
31QualityControl/OptimizationCameraImage0.01~1 s1~7days>100 × 100 (2D)NN
327[50]CostControl/OptimizationForceNumericUnknownUnknown10~100 (1D)SVM
33CostControl/OptimizationForceNumericUnknownUnknown10~100 (1D)Others
34EnergyPredictionForceNumericUnknownUnknown10~100 (1D)SVM
35EnergyPredictionForceNumericUnknownUnknown10~100 (1D)Others
36EnergyControl/OptimizationForceNumericUnknownUnknown10~100 (1D)SVM
37EnergyControl/OptimizationForceNumericUnknownUnknown10~100 (1D)Others
388[51]CostPredictionForceNumeric<0.01 sUnknown<10 (1D)NN
39CostPredictionSoundNumeric<0.01 sUnknown<10 (1D)NN
40CostPredictionVibrationNumeric<0.01 sUnknown<10 (1D)NN
41QualityPredictionForceNumeric<0.01 sUnknown<10 (1D)NN
42QualityPredictionSoundNumeric<0.01 sUnknown<10 (1D)NN
43QualityPredictionVibrationNumeric<0.01 sUnknown<10 (1D)NN
44EnergyPredictionForceNumeric<0.01 sUnknown<10 (1D)NN
45EnergyPredictionSoundNumeric<0.01 sUnknown<10 (1D)NN
46EnergyPredictionVibrationNumeric<0.01 sUnknown<10 (1D)NN
479[52]CostMonitoringSoundNumeric<0.01 sUnknown<10 (1D)NN
48CostDiagnosisSoundNumeric<0.01 sUnknown<10 (1D)NN
49QualityMonitoringSoundNumeric<0.01 sUnknown<10 (1D)NN
50QualityDiagnosisSoundNumeric<0.01 sUnknown<10 (1D)NN
5110[53]CostDiagnosisSoundNumericUnknownUnknown<10 (1D)NN
52CostDiagnosisSoundNumericUnknownUnknown<10 (1D)SVM
53TimeDiagnosisSoundNumericUnknownUnknown<10 (1D)NN
54TimeDiagnosisSoundNumericUnknownUnknown<10 (1D)SVM
5511[54]QualityDiagnosisForceNumericUnknownUnknown10~100 (1D)NN
56QualityDiagnosisVibrationNumericUnknownUnknown10~100 (1D)NN
57EnergyDiagnosisForceNumericUnknownUnknown10~100 (1D)NN
58EnergyDiagnosisVibrationNumericUnknownUnknown10~100 (1D)NN
5912[55]CostPredictionForceNumeric<0.01 sUnknown<10 (1D)NN
60QualityPredictionForceNumeric<0.01 sUnknown<10 (1D)NN
6113[56]CostMonitoringForceNumericUnknownUnknown<10 (1D)NN
62TimeMonitoringForceNumericUnknownUnknown<10 (1D)NN
6314[57]QualityMonitoringVibrationNumeric<0.01 s<1 day<10 (1D)NN
64QualityPredictionVibrationNumeric<0.01 s<1 day<10 (1D)NN
6515[58]QualityMonitoringForceNumeric<0.01 sUnknown<10 (1D)SVM
66QualityMonitoringForceNumeric<0.01 sUnknown<10 (1D)Others
6716[59]TimePredictionSoundNumeric<0.01 sUnknown<10 (1D)NN
68TimePredictionVibrationNumeric<0.01 sUnknown<10 (1D)NN
69QualityPredictionSoundNumeric<0.01 sUnknown<10 (1D)NN
70QualityPredictionVibrationNumeric<0.01 sUnknown<10 (1D)NN
7117[60]QualityMonitoringForceNumeric<0.01 sUnknown<10 (1D)NN
72QualityMonitoringForceNumeric<0.01 sUnknown<10 (1D)Others
73QualityPredictionForceNumeric<0.01 sUnknown<10 (1D)NN
74QualityPredictionForceNumeric<0.01 sUnknown<10 (1D)Others
7518[61]CostPredictionOthersNumericUnknownUnknown<10 (1D)NN
76TimePredictionOthersNumericUnknownUnknown<10 (1D)NN
7719[62]QualityPredictionForceNumericUnknownUnknown<10 (1D)NN
78QualityPredictionVibrationNumericUnknownUnknown<10 (1D)NN
79EnergyPredictionForceNumericUnknownUnknown<10 (1D)NN
80EnergyPredictionVibrationNumericUnknownUnknown<10 (1D)NN
8120[63]CostPredictionOthersNumericUnknownUnknown<10 (1D)SVM
82CostPredictionOthersNumericUnknownUnknown<10 (1D)NN
83CostControl/OptimizationOthersNumericUnknownUnknown<10 (1D)SVM
84CostControl/OptimizationOthersNumericUnknownUnknown<10 (1D)NN
85TimePredictionOthersNumericUnknownUnknown<10 (1D)SVM
86TimePredictionOthersNumericUnknownUnknown<10 (1D)NN
87TimeControl/OptimizationOthersNumericUnknownUnknown<10 (1D)SVM
88TimeControl/OptimizationOthersNumericUnknownUnknown<10 (1D)NN
8921[64]QualityPredictionForceNumeric<0.01 s<1 day<10 (1D)Tree
90QualityPredictionForceNumeric<0.01 s<1 day<10 (1D)Others
9122[65]TimePredictionVibrationNumericUnknownUnknown<10 (1D)SVM
No. | Study [Ref.] | Purpose | Task | Sensor | Data Type | Data Collection Interval | Data Collection Period | Data Dimension | AI Technique
92 | | Time | Control/Optimization | Vibration | Numeric | Unknown | Unknown | <10 (1D) | SVM
93 | | Energy | Prediction | Vibration | Numeric | Unknown | Unknown | <10 (1D) | SVM
94 | | Energy | Control/Optimization | Vibration | Numeric | Unknown | Unknown | <10 (1D) | SVM
95 | | Time | Prediction | Camera | Image | Unknown | Unknown | >100 × 100 (2D) | SVM
96 | | Time | Control/Optimization | Camera | Image | Unknown | Unknown | >100 × 100 (2D) | SVM
97 | | Energy | Prediction | Camera | Image | Unknown | Unknown | >100 × 100 (2D) | SVM
98 | | Energy | Control/Optimization | Camera | Image | Unknown | Unknown | >100 × 100 (2D) | SVM
99 | 23 [66] | Quality | Monitoring | Velocity/Acceleration | Numeric | <0.01 s | Unknown | <10 (1D) | NN
100 | | Quality | Monitoring | Velocity/Acceleration | Numeric | <0.01 s | Unknown | <10 (1D) | Others
101 | 24 [67] | Quality | Prediction | Force | Numeric | <0.01 s | Unknown | 10~100 (1D) | NN
102 | | Quality | Prediction | Vibration | Numeric | 0.01~1 s | Unknown | 10~100 (1D) | NN
103 | | Quality | Control/Optimization | Force | Numeric | <0.01 s | Unknown | 10~100 (1D) | NN
104 | | Quality | Control/Optimization | Vibration | Numeric | 0.01~1 s | Unknown | 10~100 (1D) | NN
105 | 25 [68] | Quality | Diagnosis | Camera | Image | Unknown | Unknown | >100 × 100 (2D) | NN
106 | | Quality | Control/Optimization | Camera | Image | Unknown | Unknown | >100 × 100 (2D) | NN
107 | 26 [69] | Cost | Prediction | Sound | Numeric | <0.01 s | <1 day | 10~100 (1D) | NN
108 | | Cost | Prediction | Sound | Numeric | <0.01 s | <1 day | 10~100 (1D) | Others
109 | | Time | Prediction | Sound | Numeric | <0.01 s | <1 day | 10~100 (1D) | NN
110 | | Time | Prediction | Sound | Numeric | <0.01 s | <1 day | 10~100 (1D) | Others
111 | 27 [70] | Time | Prediction | Energy | Numeric | <0.01 s | <1 day | <10 (1D) | NN
112 | | Cost | Prediction | Energy | Numeric | <0.01 s | <1 day | <10 (1D) | NN
113 | 28 [71] | Quality | Prediction | Force | Numeric | 0.01~1 s | Unknown | <10 (1D) | NN
114 | 29 [72] | Quality | Diagnosis | Vibration | Numeric | <0.01 s | Unknown | <10 (1D) | NN
115 | | Energy | Diagnosis | Vibration | Numeric | <0.01 s | Unknown | <10 (1D) | NN
116 | 30 [73] | Cost | Prediction | Force | Numeric | <0.01 s | Unknown | <10 (1D) | NN
117 | | Quality | Prediction | Force | Numeric | <0.01 s | Unknown | <10 (1D) | NN
118 | 31 [74] | Cost | Monitoring | Vibration | Numeric | <0.01 s | Unknown | <10 (1D) | NN
119 | | Quality | Monitoring | Vibration | Numeric | <0.01 s | Unknown | <10 (1D) | NN
120 | | Energy | Monitoring | Vibration | Numeric | <0.01 s | Unknown | <10 (1D) | NN
121 | 32 [75] | Quality | Monitoring | Others | String | Unknown | Unknown | <10 (1D) | NN
122 | | Quality | Prediction | Others | String | Unknown | Unknown | <10 (1D) | NN
123 | | Quality | Monitoring | Others | Numeric | Unknown | Unknown | <10 (1D) | NN
124 | | Quality | Prediction | Others | Numeric | Unknown | Unknown | <10 (1D) | NN
125 | 33 [76] | Cost | Monitoring | Energy | Numeric | Unknown | Unknown | <10 (1D) | NN
126 | | Cost | Prediction | Energy | Numeric | Unknown | Unknown | <10 (1D) | NN
127 | | Quality | Monitoring | Energy | Numeric | Unknown | Unknown | <10 (1D) | NN
128 | | Quality | Prediction | Energy | Numeric | Unknown | Unknown | <10 (1D) | NN
129 | 34 [77] | Cost | Prediction | Others | Numeric | Unknown | Unknown | <10 (1D) | Tree
130 | | Cost | Prediction | Others | Numeric | Unknown | Unknown | <10 (1D) | kNN
131 | | Cost | Prediction | Others | Numeric | Unknown | Unknown | <10 (1D) | SVM
132 | | Cost | Prediction | Others | Numeric | Unknown | Unknown | <10 (1D) | Others
133 | | Energy | Control/Optimization | Others | Numeric | Unknown | Unknown | <10 (1D) | Tree
134 | | Energy | Control/Optimization | Others | Numeric | Unknown | Unknown | <10 (1D) | kNN
135 | | Energy | Control/Optimization | Others | Numeric | Unknown | Unknown | <10 (1D) | SVM
136 | | Energy | Control/Optimization | Others | Numeric | Unknown | Unknown | <10 (1D) | Others
137 | 35 [78] | Quality | Diagnosis | Energy | Numeric | <0.01 s | Unknown | <10 (1D) | NN
138 | 36 [79] | Quality | Prediction | Others | Numeric | Unknown | Unknown | <10 (1D) | NN
139 | | Quality | Control/Optimization | Others | Numeric | Unknown | Unknown | <10 (1D) | NN
140 | 37 [80] | Quality | Prediction | Velocity/Acceleration | Numeric | Unknown | Unknown | <10 (1D) | NN
141 | 38 [81] | Cost | Diagnosis | Camera | Image | Unknown | Unknown | >100 × 100 (2D) | NN
142 | 39 [82] | Cost | Prediction | Others | Numeric | Unknown | Unknown | <10 (1D) | NN
143 | | Cost | Prediction | Others | Numeric | Unknown | Unknown | <10 (1D) | Others
144 | | Time | Prediction | Others | Numeric | Unknown | Unknown | <10 (1D) | NN
145 | | Time | Prediction | Others | Numeric | Unknown | Unknown | <10 (1D) | Others
146 | 40 [83] | Cost | Monitoring | Force | Numeric | <0.01 s | Unknown | <10 (1D) | NN
147 | | Cost | Monitoring | Velocity/Acceleration | Numeric | <0.01 s | Unknown | <10 (1D) | NN
148 | | Quality | Monitoring | Force | Numeric | <0.01 s | Unknown | <10 (1D) | NN
149 | | Quality | Monitoring | Velocity/Acceleration | Numeric | <0.01 s | Unknown | <10 (1D) | NN
150 | | Energy | Monitoring | Force | Numeric | <0.01 s | Unknown | <10 (1D) | NN
151 | | Energy | Monitoring | Velocity/Acceleration | Numeric | <0.01 s | Unknown | <10 (1D) | NN
152 | 41 [84] | Quality | Diagnosis | Vibration | Numeric | <0.01 s | <1 day | <10 (1D) | NN
153 | | Fault | Diagnosis | Vibration | Numeric | <0.01 s | <1 day | <10 (1D) | NN
154 | 42 [85] | Cost | Control/Optimization | Energy | Numeric | Unknown | Unknown | <10 (1D) | SVM
155 | | Cost | Control/Optimization | Energy | Numeric | Unknown | Unknown | <10 (1D) | NN
156 | | Cost | Control/Optimization | Force | Numeric | Unknown | Unknown | <10 (1D) | SVM
157 | | Cost | Control/Optimization | Force | Numeric | Unknown | Unknown | <10 (1D) | NN
158 | | Cost | Control/Optimization | Sound | Numeric | Unknown | Unknown | <10 (1D) | SVM
159 | | Cost | Control/Optimization | Sound | Numeric | Unknown | Unknown | <10 (1D) | NN
160 | | Cost | Control/Optimization | Vibration | Numeric | Unknown | Unknown | <10 (1D) | SVM
161 | | Cost | Control/Optimization | Vibration | Numeric | Unknown | Unknown | <10 (1D) | NN
162 | | Time | Control/Optimization | Energy | Numeric | Unknown | Unknown | <10 (1D) | SVM
163 | | Time | Control/Optimization | Energy | Numeric | Unknown | Unknown | <10 (1D) | NN
164 | | Time | Control/Optimization | Force | Numeric | Unknown | Unknown | <10 (1D) | SVM
165 | | Time | Control/Optimization | Force | Numeric | Unknown | Unknown | <10 (1D) | NN
166 | | Time | Control/Optimization | Sound | Numeric | Unknown | Unknown | <10 (1D) | SVM
167 | | Time | Control/Optimization | Sound | Numeric | Unknown | Unknown | <10 (1D) | NN
168 | | Time | Control/Optimization | Vibration | Numeric | Unknown | Unknown | <10 (1D) | SVM
169 | | Time | Control/Optimization | Vibration | Numeric | Unknown | Unknown | <10 (1D) | NN
170 | | Fault | Prediction | Energy | Numeric | Unknown | Unknown | <10 (1D) | SVM
171 | | Fault | Prediction | Energy | Numeric | Unknown | Unknown | <10 (1D) | NN
172 | | Fault | Prediction | Force | Numeric | Unknown | Unknown | <10 (1D) | SVM
173 | | Fault | Prediction | Force | Numeric | Unknown | Unknown | <10 (1D) | NN
174 | | Fault | Prediction | Sound | Numeric | Unknown | Unknown | <10 (1D) | SVM
175 | | Fault | Prediction | Sound | Numeric | Unknown | Unknown | <10 (1D) | NN
176 | | Fault | Prediction | Vibration | Numeric | Unknown | Unknown | <10 (1D) | SVM
177 | | Fault | Prediction | Vibration | Numeric | Unknown | Unknown | <10 (1D) | NN
178 | | Fault | Control/Optimization | Energy | Numeric | Unknown | Unknown | <10 (1D) | SVM
179 | | Fault | Control/Optimization | Energy | Numeric | Unknown | Unknown | <10 (1D) | NN
180 | | Fault | Control/Optimization | Force | Numeric | Unknown | Unknown | <10 (1D) | SVM
181 | | Fault | Control/Optimization | Force | Numeric | Unknown | Unknown | <10 (1D) | NN
182 | | Fault | Control/Optimization | Sound | Numeric | Unknown | Unknown | <10 (1D) | SVM
183 | | Fault | Control/Optimization | Sound | Numeric | Unknown | Unknown | <10 (1D) | NN
184 | | Fault | Control/Optimization | Vibration | Numeric | Unknown | Unknown | <10 (1D) | SVM
185 | | Fault | Control/Optimization | Vibration | Numeric | Unknown | Unknown | <10 (1D) | NN
186 | 43 [86] | Quality | Monitoring | Camera | Image | Unknown | Unknown | >100 × 100 (2D) | Tree
187 | 44 [87] | Fault | Prediction | Force | Numeric | <0.01 s | <1 day | <10 (1D) | Tree
188 | 45 [88] | Quality | Monitoring | Energy | Numeric | Unknown | Unknown | <10 (1D) | NN
189 | | Quality | Monitoring | Sound | Numeric | Unknown | Unknown | <10 (1D) | NN
190 | | Quality | Monitoring | Vibration | Numeric | Unknown | Unknown | <10 (1D) | NN
191 | | Fault | Monitoring | Energy | Numeric | Unknown | Unknown | <10 (1D) | NN
192 | | Fault | Monitoring | Sound | Numeric | Unknown | Unknown | <10 (1D) | NN
193 | | Fault | Monitoring | Vibration | Numeric | Unknown | Unknown | <10 (1D) | NN
194 | | Fault | Prediction | Energy | Numeric | Unknown | Unknown | <10 (1D) | NN
195 | | Fault | Prediction | Sound | Numeric | Unknown | Unknown | <10 (1D) | NN
196 | | Fault | Prediction | Vibration | Numeric | Unknown | Unknown | <10 (1D) | NN
197 | 46 [89] | Quality | Prediction | Others | Numeric | Unknown | Unknown | <10 (1D) | SVM
198 | | Quality | Prediction | Others | Numeric | Unknown | Unknown | <10 (1D) | NN
199 | | Quality | Control/Optimization | Others | Numeric | Unknown | Unknown | <10 (1D) | SVM
200 | | Quality | Control/Optimization | Others | Numeric | Unknown | Unknown | <10 (1D) | NN
201 | 47 [90] | Quality | Monitoring | Vibration | Numeric | <0.01 s | Unknown | <10 (1D) | NN
202 | | Quality | Prediction | Vibration | Numeric | <0.01 s | Unknown | <10 (1D) | NN
203 | 48 [91] | Quality | Prediction | Camera | Image | Unknown | Unknown | >100 × 100 (2D) | NN
204 | 49 [92] | Fault | Diagnosis | Force | Numeric | <0.01 s | Unknown | <10 (1D) | NN
205 | 50 [93] | Quality | Prediction | Force | Numeric | <0.01 s | Unknown | <10 (1D) | NN
206 | | Quality | Prediction | Vibration | Numeric | <0.01 s | Unknown | <10 (1D) | NN
207 | 51 [94] | Energy | Prediction | Others | Numeric | Unknown | Unknown | <10 (1D) | SVM
208 | | Energy | Control/Optimization | Others | Numeric | Unknown | Unknown | <10 (1D) | SVM
209 | 52 [95] | Quality | Prediction | Others | String | Unknown | Unknown | <10 (1D) | NN
210 | | Quality | Control/Optimization | Others | String | Unknown | Unknown | <10 (1D) | NN
211 | | Quality | Prediction | Others | Numeric | Unknown | Unknown | <10 (1D) | NN
212 | | Quality | Control/Optimization | Others | Numeric | Unknown | Unknown | <10 (1D) | NN
213 | 53 [96] | Quality | Monitoring | Vibration | Numeric | <0.01 s | <1 day | <10 (1D) | NN
214 | | Quality | Prediction | Vibration | Numeric | <0.01 s | <1 day | <10 (1D) | NN
215 | 54 [97] | Quality | Prediction | Others | Numeric | Unknown | Unknown | <10 (1D) | NN
216 | | Fault | Prediction | Others | Numeric | Unknown | Unknown | <10 (1D) | NN
217 | 55 [98] | Quality | Prediction | Others | Numeric | Unknown | Unknown | <10 (1D) | NN

References

  1. Peretz-Andersson, E.; Tabares, S.; Mikalef, P.; Parida, V. Artificial Intelligence Implementation in Manufacturing SMEs: A Resource Orchestration Approach. Int. J. Inf. Manag. 2024, 77, 102781. [Google Scholar] [CrossRef]
  2. Jing, H.; Zhang, S. The Impact of Artificial Intelligence on ESG Performance of Manufacturing Firms: The Mediating Role of Ambidextrous Green Innovation. Systems 2024, 12, 499. [Google Scholar] [CrossRef]
  3. Gao, Y.; Liu, Y.; Wu, W. How Does Artificial Intelligence Capability Affect Product Innovation in Manufacturing Enterprises? Evidence from China. Systems 2025, 13, 480. [Google Scholar] [CrossRef]
  4. Heimberger, H.; Horvat, D.; Schultmann, F. Exploring the Factors Driving AI Adoption in Production: A Systematic Literature Review and Future Research Agenda. Inf. Technol. Manag. 2024. [Google Scholar] [CrossRef]
  5. Di Bella, L.; Katsinis, A.; Lagüera González, J.; European Commission. Annual Report on European SMEs 2022/2023; Publications Office of the European Union: Luxembourg, 2023; ISBN 978-92-9469-591-8.
  6. US Census Bureau 2021 SUSB Annual Data Tables by Establishment Industry. Available online: https://www.census.gov/data/tables/2021/econ/susb/2021-susb-annual.html (accessed on 2 March 2025).
  7. Sturm, T.; Pumplun, L.; Gerlach, J.P.; Kowalczyk, M.; Buxmann, P. Machine Learning Advice in Managerial Decision-Making: The Overlooked Role of Decision Makers’ Advice Utilization. J. Strateg. Inf. Syst. 2023, 32, 101790. [Google Scholar] [CrossRef]
  8. Gao, R.X.; Krüger, J.; Merklein, M.; Möhring, H.-C.; Váncza, J. Artificial Intelligence in Manufacturing: State of the Art, Perspectives, and Future Directions. CIRP Ann. 2024, 73, 723–749. [Google Scholar] [CrossRef]
  9. Zhang, D.; Gu, R. Behavioral Preference in Sequential Decision-Making and Its Association with Anxiety. Hum. Brain Mapp. 2018, 39, 2482–2499. [Google Scholar] [CrossRef]
  10. Malyshev, V.V.; Piyavsky, B.S.; Piyavsky, S.A. A Decision Making Method under Conditions of Diversity of Means of Reducing Uncertainty. J. Comput. Syst. Sci. Int. 2010, 49, 44–58. [Google Scholar] [CrossRef]
  11. Wen, M.; Lin, R.; Wang, H.; Yang, Y.; Wen, Y.; Mai, L.; Wang, J.; Zhang, H.; Zhang, W. Large Sequence Models for Sequential Decision-Making: A Survey. Front. Comput. Sci. 2023, 17, 176349. [Google Scholar] [CrossRef]
  12. Carlsson, C.; Fullér, R. Multiple Criteria Decision Making: The Case for Interdependence. Comput. Oper. Res. 1995, 22, 251–260. [Google Scholar] [CrossRef]
  13. MacKay, D.J.C. Information Theory, Inference and Learning Algorithms; Cambridge University Press: Cambridge, UK, 2003; ISBN 978-0-521-64298-9. [Google Scholar]
  14. Sutton, R.S.; Barto, A.G. The Reinforcement Learning Problem; MIT Press: Cambridge, MA, USA, 1998; ISBN 978-0-262-25705-3. [Google Scholar]
  15. Huang, S.; Lin, F. The Design and Evaluation of an Intelligent Sales Agent for Online Persuasion and Negotiation. Electron. Commer. Res. Appl. 2007, 6, 285–296. [Google Scholar] [CrossRef]
  16. Li, P.; Ren, S.; Zhang, Q.; Wang, X.; Liu, Y. Think4SCND: Reinforcement Learning with Thinking Model for Dynamic Supply Chain Network Design. IEEE Access 2024, 12, 195974–195985. [Google Scholar] [CrossRef]
  17. Chen, L.; Lu, K.; Rajeswaran, A.; Lee, K.; Grover, A.; Laskin, M.; Abbeel, P.; Srinivas, A.; Mordatch, I. Decision Transformer: Reinforcement Learning via Sequence Modeling. In Proceedings of the 35th Conference on Neural Information Processing Systems, Online, 6–14 December 2021; pp. 15084–15097. [Google Scholar]
  18. Lee, K.-H.; Nachum, O.; Yang, M.S.; Lee, L.; Freeman, D.; Guadarrama, S.; Fischer, I.; Xu, W.; Jang, E.; Michalewski, H. Multi-Game Decision Transformers. In Proceedings of the 36th Conference on Neural Information Processing Systems (NeurIPS 2022), New Orleans, LA, USA, 28 November–9 December 2022; pp. 27921–27936. [Google Scholar]
  19. Janner, M.; Li, Q.; Levine, S. Offline Reinforcement Learning as One Big Sequence Modeling Problem. In Proceedings of the 35th Conference on Neural Information Processing Systems, Online, 6–14 December 2021; pp. 1273–1286. [Google Scholar]
  20. Shu, H.; Liu, T.; Mu, X.; Cao, D. Driving Tasks Transfer Using Deep Reinforcement Learning for Decision-Making of Autonomous Vehicles in Unsignalized Intersection. IEEE Trans. Veh. Technol. 2022, 71, 41–52. [Google Scholar] [CrossRef]
  21. Zhang, N.; Yan, J.; Hu, C.; Sun, Q.; Yang, L.; Gao, D.W.; Guerrero, J.M.; Li, Y. Price-Matching-Based Regional Energy Market with Hierarchical Reinforcement Learning Algorithm. IEEE Trans. Ind. Inform. 2024, 20, 11103–11114. [Google Scholar] [CrossRef]
  22. Jiang, D.; Wang, H.; Li, T.; Gouda, M.A.; Zhou, B. Real-Time Tracker of Chicken for Poultry Based on Attention Mechanism-Enhanced YOLO-Chicken Algorithm. Comput. Electron. Agric. 2025, 237, 110640. [Google Scholar] [CrossRef]
  23. Zhang, R.; Torabi, F.; Warnell, G.; Stone, P. Recent Advances in Leveraging Human Guidance for Sequential Decision-Making Tasks. Auton. Agents Multi-Agent Syst. 2021, 35, 31. [Google Scholar] [CrossRef]
  24. Vergara, J.R.; Estévez, P.A. A Review of Feature Selection Methods Based on Mutual Information. Neural Comput. Appl. 2014, 24, 175–186. [Google Scholar] [CrossRef]
  25. Qian, W.; Shu, W. Mutual Information Criterion for Feature Selection from Incomplete Data. Neurocomputing 2015, 168, 210–220. [Google Scholar] [CrossRef]
  26. Todorov, D.; Setchi, R. Time-Efficient Estimation of Conditional Mutual Information for Variable Selection in Classification. Comput. Stat. Data Anal. 2014, 72, 105–127. [Google Scholar] [CrossRef]
  27. Qiu, P.; Niu, Z. TCIC_FS: Total Correlation Information Coefficient-Based Feature Selection Method for High-Dimensional Data. Knowl.-Based Syst. 2021, 231, 107418. [Google Scholar] [CrossRef]
  28. Li, Q.; Steeg, G.V.; Malo, J. Functional Connectivity via Total Correlation: Analytical Results in Visual Areas. Neurocomputing 2024, 571, 127143. [Google Scholar] [CrossRef]
  29. Dai, J.; Xu, Q.; Wang, W.; Tian, H. Conditional Entropy for Incomplete Decision Systems and Its Application in Data Mining. Int. J. Gen. Syst. 2012, 41, 713–728. [Google Scholar] [CrossRef]
  30. Yang, B.; Qi, G.; Xie, B. The Pseudo-Information Entropy of Z-Number and Its Applications in Multi-Attribute Decision-Making. Inf. Sci. 2024, 655, 119886. [Google Scholar] [CrossRef]
  31. Feng, Z.; Ding, X.; Zhang, H.; Liu, Y.; Yan, W.; Jiang, X. An Energy Consumption Estimation Method for the Tool Setting Process in CNC Milling Based on the Modular Arrangement of Predetermined Time Standards. Energies 2023, 16, 7064. [Google Scholar] [CrossRef]
  32. Bertsimas, D.; Öztürk, B. Global Optimization via Optimal Decision Trees. J. Glob. Optim. 2023. [Google Scholar] [CrossRef]
  33. Saha, D.; Manickavasagan, A. Machine Learning Techniques for Analysis of Hyperspectral Images to Determine Quality of Food Products: A Review. Curr. Res. Food Sci. 2021, 4, 28–44. [Google Scholar] [CrossRef]
  34. Chang, H.S.; Fu, M.C.; Hu, J.; Marcus, S.I. A Survey of Some Simulation-Based Algorithms for Markov Decision Processes. Commun. Inf. Syst. 2007, 7, 59–92. [Google Scholar] [CrossRef]
  35. Sun, Z.; Li, L. Potential Capability Estimation for Real Time Electricity Demand Response of Sustainable Manufacturing Systems Using Markov Decision Process. J. Clean. Prod. 2014, 65, 184–193. [Google Scholar] [CrossRef]
  36. Filippi, S.; Cappé, O.; Garivier, A. Optimism in Reinforcement Learning and Kullback-Leibler Divergence. In Proceedings of the 2010 48th Annual Allerton Conference on Communication, Control, and Computing (Allerton), Monticello, IL, USA, 29 September–1 October 2010; pp. 115–122. [Google Scholar]
  37. Li, Y.; Shan, S.; Liu, Q.; Oliva, J.B. Towards Robust Active Feature Acquisition. arXiv 2021, arXiv:2107.04163. [Google Scholar] [CrossRef]
  38. Maliah, S.; Shani, G. MDP-Based Cost Sensitive Classification Using Decision Trees. In Proceedings of the AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018; Volume 32. [Google Scholar] [CrossRef]
  39. Gosavi, A. Simulation-Based Optimization: Parametric Optimization Techniques and Reinforcement Learning. In Operations Research/Computer Science Interfaces Series; Springer: Boston, MA, USA, 2015; Volume 55, ISBN 978-1-4899-7490-7. [Google Scholar]
  40. Khriji, L.; Touati, F.; Benhmed, K.; Al-Yahmedi, A. Mobile Robot Navigation Based on Q-Learning Technique. Int. J. Adv. Robot. Syst. 2011, 8, 4. [Google Scholar] [CrossRef]
  41. Chakole, J.B.; Kolhe, M.S.; Mahapurush, G.D.; Yadav, A.; Kurhekar, M.P. A Q-Learning Agent for Automated Trading in Equity Stock Markets. Expert Syst. Appl. 2021, 163, 113761. [Google Scholar] [CrossRef]
  42. Wang, Y.-H.; Li, T.-H.S.; Lin, C.-J. Backward Q-Learning: The Combination of Sarsa Algorithm and Q-Learning. Eng. Appl. Artif. Intell. 2013, 26, 2184–2193. [Google Scholar] [CrossRef]
  43. Doltsinis, S.; Ferreira, P.; Lohse, N. An MDP Model-Based Reinforcement Learning Approach for Production Station Ramp-Up Optimization: Q-Learning Analysis. IEEE Trans. Syst. Man Cybern. Syst. 2014, 44, 1125–1138. [Google Scholar] [CrossRef]
  44. Stavropoulos, P.; Souflas, T.; Papaioannou, C.; Bikas, H.; Mourtzis, D. An Adaptive, Artificial Intelligence-Based Chatter Detection Method for Milling Operations. Int. J. Adv. Manuf. Technol. 2023, 124, 2037–2058. [Google Scholar] [CrossRef]
  45. Huang, Z.; Shao, J.; Guo, W.; Li, W.; Zhu, J.; Fang, D. Hybrid Machine Learning-Enabled Multi-Information Fusion for Indirect Measurement of Tool Flank Wear in Milling. Measurement 2023, 206, 112255. [Google Scholar] [CrossRef]
  46. Wong, S.Y.; Chuah, J.H.; Yap, H.J.; Tan, C.F. Dissociation Artificial Neural Network for Tool Wear Estimation in CNC Milling. Int. J. Adv. Manuf. Technol. 2023, 125, 887–901. [Google Scholar] [CrossRef]
  47. Zheng, X.; Arrazola, P.; Perez, R.; Echebarria, D.; Kiritsis, D.; Aristimuño, P.; Sáez-de-Buruaga, M. Exploring the Effectiveness of Using Internal CNC System Signals for Chatter Detection in Milling Process. Mech. Syst. Signal Process. 2023, 185, 109812. [Google Scholar] [CrossRef]
  48. Tran, M.-Q.; Liu, M.-K.; Elsisi, M. Effective Multi-Sensor Data Fusion for Chatter Detection in Milling Process. ISA Trans. 2022, 125, 514–527. [Google Scholar] [CrossRef] [PubMed]
  49. Bhandari, B.; Park, G. Development of a Real-Time Security Management System for Restricted Access Areas Using Computer Vision and Deep Learning. J. Transp. Saf. Secur. 2022, 14, 655–670. [Google Scholar] [CrossRef]
  50. Checa, D.; Urbikain, G.; Beranoagirre, A.; Bustillo, A.; López de Lacalle, L.N. Using Machine-Learning Techniques and Virtual Reality to Design Cutting Tools for Energy Optimization in Milling Operations. Int. J. Comput. Integr. Manuf. 2022, 35, 951–971. [Google Scholar] [CrossRef]
  51. He, Z.; Shi, T.; Xuan, J. Milling Tool Wear Prediction Using Multi-Sensor Feature Fusion Based on Stacked Sparse Autoencoders. Measurement 2022, 190, 110719. [Google Scholar] [CrossRef]
  52. Gauder, D.; Biehler, M.; Gölz, J.; Schulze, V.; Lanza, G. In-Process Acoustic Pore Detection in Milling Using Deep Learning. CIRP J. Manuf. Sci. Technol. 2022, 37, 125–133. [Google Scholar] [CrossRef]
  53. Sestito, G.S.; Venter, G.S.; Ribeiro, K.S.B.; Rodrigues, A.R.; da Silva, M.M. In-Process Chatter Detection in Micro-Milling Using Acoustic Emission via Machine Learning Classifiers. Int. J. Adv. Manuf. Technol. 2022, 120, 7293–7303. [Google Scholar] [CrossRef]
  54. Han, Z.; Zhuo, Y.; Yan, Y.; Jin, H.; Fu, H. Chatter Detection in Milling of Thin-Walled Parts Using Multi-Channel Feature Fusion and Temporal Attention-Based Network. Mech. Syst. Signal Process. 2022, 179, 109367. [Google Scholar] [CrossRef]
  55. Yang, C.; Zhou, J.; Li, E.; Zhang, H.; Wang, M.; Li, Z. Milling Cutter Wear Prediction Method under Variable Working Conditions Based on LRCN. Int. J. Adv. Manuf. Technol. 2022, 121, 2647–2661. [Google Scholar] [CrossRef]
  56. Vaishnav, S.; Desai, K.A. Long Short-Term Memory-Based Cutting Depth Monitoring System for End Milling Operation. J. Comput. Inf. Sci. Eng. 2022, 22, 051001. [Google Scholar] [CrossRef]
  57. Ma, K.; Wang, G.; Yang, K.; Hu, M.; Li, J. Tool Wear Monitoring for Cavity Milling Based on Vibration Singularity Analysis and Stacked LSTM. Int. J. Adv. Manuf. Technol. 2022, 120, 4023–4039. [Google Scholar] [CrossRef]
  58. Pan, T.; Zhang, J.; Zhang, X.; Zhao, W.; Zhang, H.; Lu, B. Milling Force Coefficients-Based Tool Wear Monitoring for Variable Parameter Milling. Int. J. Adv. Manuf. Technol. 2022, 120, 4565–4580. [Google Scholar] [CrossRef]
  59. Shah, M.; Vakharia, V.; Chaudhari, R.; Vora, J.; Pimenov, D.Y.; Giasin, K. Tool Wear Prediction in Face Milling of Stainless Steel Using Singular Generative Adversarial Network and LSTM Deep Learning Models. Int. J. Adv. Manuf. Technol. 2022, 121, 723–736. [Google Scholar] [CrossRef]
  60. Kim, Y.; Kim, T.; Youn, B.D.; Ahn, S.-H. Machining Quality Monitoring (MQM) in Laser-Assisted Micro-Milling of Glass Using Cutting Force Signals: An Image-Based Deep Transfer Learning. J. Intell. Manuf. 2022, 33, 1813–1828. [Google Scholar] [CrossRef]
  61. Qazani, M.R.C.; Pourmostaghimi, V.; Moayyedian, M.; Pedrammehr, S. Estimation of Tool–Chip Contact Length Using Optimized Machine Learning in Orthogonal Cutting. Eng. Appl. Artif. Intell. 2022, 114, 105118. [Google Scholar] [CrossRef]
  62. Zhang, X.; Yu, T.; Xu, P.; Zhao, J. In-Process Stochastic Tool Wear Identification and Its Application to the Improved Cutting Force Modeling of Micro Milling. Mech. Syst. Signal Process. 2022, 164, 108233. [Google Scholar] [CrossRef]
  63. Nguyen, A.-T.; Nguyen, V.-H.; Le, T.-T.; Nguyen, N.-T. Multiobjective Optimization of Surface Roughness and Tool Wear in High-Speed Milling of AA6061 by Machine Learning and NSGA-II. Adv. Mater. Sci. Eng. 2022, 2022, 5406570. [Google Scholar] [CrossRef]
  64. Mahmood, J.; Mustafa, G.; Ali, M. Accurate Estimation of Tool Wear Levels during Milling, Drilling and Turning Operations by Designing Novel Hyperparameter Tuned Models Based on LightGBM and Stacking. Measurement 2022, 190, 110722. [Google Scholar] [CrossRef]
  65. Sarat Babu, M.; Babu Rao, T. Multi-Sensor Heterogeneous Data-Based Online Tool Health Monitoring in Milling of IN718 Superalloy Using OGM (1, N) Model and SVM. Measurement 2022, 199, 111501. [Google Scholar] [CrossRef]
  66. Peng, Y.; Song, Q.; Wang, R.; Liu, Z.; Liu, Z. Intelligent Recognition of Tool Wear in Milling Based on a Single Sensor Signal. Int. J. Adv. Manuf. Technol. 2023, 124, 1077–1093. [Google Scholar] [CrossRef]
  67. Li, E.; Zhou, J.; Yang, C.; Wang, M.; Li, Z.; Zhang, H.; Jiang, T. CNN-GRU Network-Based Force Prediction Approach for Variable Working Condition Milling Clamping Points of Deformable Parts. Int. J. Adv. Manuf. Technol. 2022, 119, 7843–7863. [Google Scholar] [CrossRef]
  68. Carbone, N.; Bernini, L.; Albertelli, P.; Monno, M. Assessment of Milling Condition by Image Processing of the Produced Surfaces. Int. J. Adv. Manuf. Technol. 2023, 124, 1681–1697. [Google Scholar] [CrossRef]
  69. Li, Y.; Bao, J.; Chen, T.; Yu, A.; Yang, R. Prediction of Ball Milling Performance by a Convolutional Neural Network Model and Transfer Learning. Powder Technol. 2022, 403, 117409. [Google Scholar] [CrossRef]
  70. Peng, D.; Li, H.; Dai, Y.; Wang, Z.; Ou, J. Prediction of Milling Force Based on Spindle Current Signal by Neural Networks. Measurement 2022, 205, 112153. [Google Scholar] [CrossRef]
  71. Bai, L.; Xu, F.; Chen, X.; Su, X.; Lai, F.; Xu, J. A Hybrid Deep Learning Model for Robust Prediction of the Dimensional Accuracy in Precision Milling of Thin-Walled Structural Components. Front. Mech. Eng. 2022, 17, 32. [Google Scholar] [CrossRef]
  72. Sener, B.; Gudelek, M.U.; Ozbayoglu, A.M.; Unver, H.O. A Novel Chatter Detection Method for Milling Using Deep Convolution Neural Networks. Measurement 2021, 182, 109689. [Google Scholar] [CrossRef]
  73. Ma, J.; Luo, D.; Liao, X.; Zhang, Z.; Huang, Y.; Lu, J. Tool Wear Mechanism and Prediction in Milling TC18 Titanium Alloy Using Deep Learning. Measurement 2021, 173, 108554. [Google Scholar] [CrossRef]
  74. Huang, Z.; Zhu, J.; Lei, J.; Li, X.; Tian, F. Tool Wear Monitoring with Vibration Signals Based on Short-Time Fourier Transform and Deep Convolutional Neural Network in Milling. Math. Probl. Eng. 2021, 2021, 9976939. [Google Scholar] [CrossRef]
  75. Guo, S.; Zheng, H.; Liu, X.; Gu, L. Comparison on Milling Force Model Prediction of New Cold Saw Blade Milling Cutter Based on Deep Neural Network and Regression Analysis. Manuf. Technol. 2021, 21, 456–463. [Google Scholar] [CrossRef]
  76. Marani, M.; Zeinali, M.; Songmene, V.; Mechefske, C.K. Tool Wear Prediction in High-Speed Turning of a Steel Alloy Using Long Short-Term Memory Modelling. Measurement 2021, 177, 109329. [Google Scholar] [CrossRef]
  77. Charalampous, P. Prediction of Cutting Forces in Milling Using Machine Learning Algorithms and Finite Element Analysis. J. Mater. Eng. Perform. 2021, 30, 2002–2013. [Google Scholar] [CrossRef]
  78. Vashisht, R.K.; Peng, Q. Online Chatter Detection for Milling Operations Using LSTM Neural Networks Assisted by Motor Current Signals of Ball Screw Drives. J. Manuf. Sci. Eng. 2020, 143, 011008. [Google Scholar] [CrossRef]
  79. Eser, A.; Aşkar Ayyıldız, E.; Ayyıldız, M.; Kara, F. Artificial Intelligence-Based Surface Roughness Estimation Modelling for Milling of AA6061 Alloy. Adv. Mater. Sci. Eng. 2021, 2021, 5576600. [Google Scholar] [CrossRef]
  80. Möhring, H.-C.; Eschelbacher, S.; Georgi, P. Machine Learning Approaches for Real-Time Monitoring and Evaluation of Surface Roughness Using a Sensory Milling Tool. Procedia CIRP 2021, 102, 264–269. [Google Scholar] [CrossRef]
  81. Chen, Y.; Yi, H.; Liao, C.; Huang, P.; Chen, Q. Visual Measurement of Milling Surface Roughness Based on Xception Model with Convolutional Neural Network. Measurement 2021, 186, 110217. [Google Scholar] [CrossRef]
  82. Wang, J.; Zou, B.; Liu, M.; Li, Y.; Ding, H.; Xue, K. Milling Force Prediction Model Based on Transfer Learning and Neural Network. J. Intell. Manuf. 2021, 32, 947–956. [Google Scholar] [CrossRef]
  83. Yan, B.; Zhu, L.; Dun, Y. Tool Wear Monitoring of TC4 Titanium Alloy Milling Process Based on Multi-Channel Signal and Time-Dependent Properties by Using Deep Learning. J. Manuf. Syst. 2021, 61, 495–508. [Google Scholar] [CrossRef]
  84. Rahimi, M.H.; Huynh, H.N.; Altintas, Y. On-Line Chatter Detection in Milling with Hybrid Machine Learning and Physics-Based Model. CIRP J. Manuf. Sci. Technol. 2021, 35, 25–40. [Google Scholar] [CrossRef]
  85. Sayyad, S.; Kumar, S.; Bongale, A.; Kamat, P.; Patil, S.; Kotecha, K. Data-Driven Remaining Useful Life Estimation for Milling Process: Sensors, Algorithms, Datasets, and Future Directions. IEEE Access 2021, 9, 110255–110286. [Google Scholar] [CrossRef]
  86. Riego, V.; Castejón-Limas, M.; Sánchez-González, L.; Fernández-Robles, L.; Perez, H.; Diez-Gonzalez, J.; Guerrero-Higueras, Á.-M. Strong Classification System for Wear Identification on Milling Processes Using Computer Vision and Ensemble Learning. Neurocomputing 2021, 456, 678–684. [Google Scholar] [CrossRef]
  87. Varghese, A.; Kulkarni, V.; Joshi, S.S. Tool Life Stage Prediction in Micro-Milling From Force Signal Analysis Using Machine Learning Methods. J. Manuf. Sci. Eng. 2020, 143, 054501. [Google Scholar] [CrossRef]
  88. Traini, E.; Bruno, G.; Lombardi, F. Tool Condition Monitoring Framework for Predictive Maintenance: A Case Study on Milling Process. Int. J. Prod. Res. 2021, 59, 7179–7193. [Google Scholar] [CrossRef]
  89. Karthik, R.M.C.; Malghan, R.L.; Kara, F.; Shettigar, A.; Rao, S.S.; Herbert, M.A. Influence of Support Vector Regression (SVR) on Cryogenic Face Milling. Adv. Mater. Sci. Eng. 2021, 2021, 9984369. [Google Scholar] [CrossRef]
  90. Zheng, G.; Sun, W.; Zhang, H.; Zhou, Y.; Gao, C. Tool Wear Condition Monitoring in Milling Process Based on Data Fusion Enhanced Long Short-Term Memory Network under Different Cutting Conditions. Maint. Reliab. Eksploat. Niezawodn. 2021, 23, 612–618. [Google Scholar] [CrossRef]
  91. Rifai, A.P.; Aoyama, H.; Tho, N.H.; Dawal, S.Z.M.; Masruroh, N.A. Evaluation of Turned and Milled Surfaces Roughness Using Convolutional Neural Network. Measurement 2020, 161, 107860. [Google Scholar] [CrossRef]
  92. Tran, M.-Q.; Liu, M.-K.; Tran, Q.-V. Milling Chatter Detection Using Scalogram and Deep Convolutional Neural Network. Int. J. Adv. Manuf. Technol. 2020, 107, 1505–1516. [Google Scholar] [CrossRef]
  93. Huang, Z.; Zhu, J.; Lei, J.; Li, X.; Tian, F. Tool Wear Predicting Based on Multi-Domain Feature Fusion by Deep Convolutional Neural Network in Milling Operations. J. Intell. Manuf. 2020, 31, 953–966. [Google Scholar] [CrossRef]
  94. Hossein Rabiee, A.; Tahmasbi, V.; Qasemi, M. Experimental Evaluation, Modeling and Sensitivity Analysis of Temperature and Cutting Force in Bone Micro-Milling Using Support Vector Regression and EFAST Methods. Eng. Appl. Artif. Intell. 2023, 120, 105874. [Google Scholar] [CrossRef]
  95. Xie, J.; Hu, P.; Chen, J.; Han, W.; Wang, R. Deep Learning-Based Instantaneous Cutting Force Modeling of Three-Axis CNC Milling. Int. J. Mech. Sci. 2023, 246, 108153. [Google Scholar] [CrossRef]
  96. Li, B.; Liu, T.; Liao, J.; Feng, C.; Yao, L.; Zhang, J. Non-Invasive Milling Force Monitoring through Spindle Vibration with LSTM and DNN in CNC Machine Tools. Measurement 2023, 210, 112554. [Google Scholar] [CrossRef]
  97. Khoshaim, A.B.; Elsheikh, A.H.; Moustafa, E.B.; Basha, M.; Mosleh, A.O. Prediction of Residual Stresses in Turning of Pure Iron Using Artificial Intelligence-Based Methods. J. Mater. Res. Technol. 2021, 11, 2181–2194. [Google Scholar] [CrossRef]
  98. Elsheikh, A.H.; Muthuramalingam, T.; Shanmugan, S.; Mahmoud Ibrahim, A.M.; Ramesh, B.; Khoshaim, A.B.; Moustafa, E.B.; Bedairi, B.; Panchal, H.; Sathyamurthy, R. Fine-Tuned Artificial Intelligence Model Using Pigeon Optimizer for Prediction of Residual Stresses during Turning of Inconel 718. J. Mater. Res. Technol. 2021, 15, 3622–3634. [Google Scholar] [CrossRef]
Figure 1. Flowchart of the Q-learning-based algorithm of the SDM model.
Figure 2. Four cases according to knowledge of the decision order.
Figure 3. Example episodes of the four SDM simulation cases.
Figure 4. Adjusted CE by simulation case and decision stage.
Table 1. Summary of literature review on decision-making based on information theory.
Reference | Used Metric | Purpose | Problem Type
[25] | MI | Variable selection | Classification accuracy improvement
[26] | MI | Variable selection | Classification accuracy improvement
[27] | TC | Variable selection | Classification accuracy improvement
[28] | MI and TC | Dependency analysis | Functional connectivity analysis
[29] | CE | Attribute selection | Classification accuracy improvement
[30] | CE | Uncertainty quantification | Fuzzy decision modeling
This study | CE | Uncertainty reduction | Sequential decision-making
Table 2. Key properties of MI, TC, and CE.
Metric | Applicable Number of Variables | Measured Information | Dependency Type
MI | 2 | Shared information | Mutual
TC | ≥2 | Redundant information | Joint
CE | ≥2 | Uncertainty | Conditional
Table 3. Decision-making elements and attributes for AI adoption in manufacturing.
Elements | Attributes
Purpose | Cost, Energy, Fault, Time, Quality
Task | Control/Optimization, Diagnosis, Monitoring, Prediction
Sensor | Camera, Energy, Force, Position, Pressure, Rotation, Sound, Velocity/Acceleration, Vibration, Others
Data type | Image, Numeric, String
Data collection interval | <0.01 s, 0.01~1 s, >1 s
Data collection period | <1 day, 1 day~7 days, 7 days~1 month, >1 month
Data dimension | <10 (1D), 10~100 (1D), >100 (1D), <100 × 100 (2D), >100 × 100 (2D)
AI technique | Clustering, kNN, Linear, NN, SVM, Tree, Others
Table 4. Categorical variables data example used for model development.
No. | Purpose | Task | Sensor | Data Type | Data Collection Interval | Data Collection Period | Data Dimension | AI Technique
1 | 5 | 2 | 9 | 2 | 1 | 1 | 1 | 5
2 | 5 | 2 | 3 | 2 | 1 | 0 | 1 | 5
3 | 5 | 2 | 3 | 2 | 1 | 0 | 1 | 7
4 | 5 | 2 | 9 | 2 | 1 | 0 | 1 | 5
5 | 5 | 2 | 9 | 2 | 1 | 0 | 1 | 7
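The integer codes in Table 4 are consistent with numbering each element's attributes in the order listed in Table 3, starting at 1, with Unknown coded as 0; for example, Purpose = 5 corresponds to Quality and Sensor = 9 to Vibration. A minimal sketch of this encoding, under that (inferred) code book:

```python
# Inferred code book: attributes numbered 1..k in Table 3 order, "Unknown" -> 0.
# This mapping is our reconstruction; the paper does not print the code book.
ATTRIBUTES = {
    "Purpose": ["Cost", "Energy", "Fault", "Time", "Quality"],
    "Task": ["Control/Optimization", "Diagnosis", "Monitoring", "Prediction"],
    "Sensor": ["Camera", "Energy", "Force", "Position", "Pressure", "Rotation",
               "Sound", "Velocity/Acceleration", "Vibration", "Others"],
    "AI technique": ["Clustering", "kNN", "Linear", "NN", "SVM", "Tree", "Others"],
}

def encode(element: str, value: str) -> int:
    """Map an attribute label to its integer code; 'Unknown' becomes 0."""
    if value == "Unknown":
        return 0
    return ATTRIBUTES[element].index(value) + 1

row = {"Purpose": "Quality", "Task": "Diagnosis",
       "Sensor": "Vibration", "AI technique": "SVM"}
print({k: encode(k, v) for k, v in row.items()})
# {'Purpose': 5, 'Task': 2, 'Sensor': 9, 'AI technique': 5}
```

Under this mapping, the example row reproduces the code pattern of row 1 in Table 4.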
Table 5. Nomenclature of symbols used in the RL-based SDM model.
Symbol | Description
E | Full set of decision-making elements
e | Single decision-making element in the set E
S | Set of elements already selected
s | Current state (i.e., E \ S)
x | Element selected as an action
a | Action (i.e., selecting a decision element)
r(x) | Reward for selecting element x
u_x | Proportion of unknown attributes in element x
H(e) | Entropy of element e
H(e|S) | Conditional entropy of element e given selection set S
Table 6. Design of the RL-based SDM model.
State | s = E \ S
Action | a = x ∈ E \ S
Reward | r(x) = ∑_{e ∈ E\(S ∪ {x})} [H(e|S) − H(e|S ∪ {x})] · (1 − u_x)
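The reward of Table 6 can be computed from empirical frequencies: for a candidate element x, sum the drop in conditional entropy of every remaining element and scale by (1 − u_x). A minimal sketch, using toy rows rather than the paper's 55-study dataset:

```python
import math
from collections import Counter

def H(rows, cols):
    """Empirical joint entropy (bits) of the listed columns over the rows."""
    if not cols:
        return 0.0
    counts = Counter(tuple(r[c] for c in cols) for r in rows)
    n = sum(counts.values())
    return -sum((k / n) * math.log2(k / n) for k in counts.values())

def cond_H(rows, e, S):
    """Conditional entropy H(e | S) = H(e, S) - H(S)."""
    return H(rows, [e] + S) - H(rows, S)

def reward(rows, x, S, elements, u_x):
    """r(x): total drop in conditional entropy of the remaining elements
    when x joins the selected set S, scaled by (1 - u_x)."""
    remaining = [e for e in elements if e not in S and e != x]
    gain = sum(cond_H(rows, e, S) - cond_H(rows, e, S + [x]) for e in remaining)
    return gain * (1 - u_x)

# Toy rows (Purpose, Task, AI technique); columns are indexed 0..2.
rows = [("Quality", "Prediction", "NN"), ("Quality", "Monitoring", "NN"),
        ("Fault", "Diagnosis", "SVM"), ("Cost", "Prediction", "NN")]
r0 = reward(rows, 0, [], [0, 1, 2], u_x=0.0)  # reward for picking column 0 first
```

Because conditioning never increases empirical entropy, the gain term is non-negative; the (1 − u_x) factor then penalizes elements with many Unknown attributes.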
Table 7. Hyperparameters of the Q-learning model.
Hyperparameter | Value
Learning rate (α) | 0.5
Discount factor (γ) | 0.9
Exploration | ε-greedy (initial value: 1.0, decay rate: 0.9, minimum value: 0.5)
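The hyperparameters of Table 7 plug into a standard tabular Q-learning loop over element-selection episodes. The sketch below is illustrative rather than the authors' implementation: `toy_reward` stands in for the conditional-entropy reward of Table 6, and element names are abbreviated.

```python
import random

ALPHA, GAMMA = 0.5, 0.9                   # learning rate, discount factor
EPS, EPS_DECAY, EPS_MIN = 1.0, 0.9, 0.5   # epsilon-greedy schedule (Table 7)

ELEMENTS = ["Sensor", "Data type", "Interval", "Period", "Dimension", "Technique"]

def toy_reward(x, S):
    """Placeholder reward; the paper uses the CE-based reward of Table 6."""
    return 1.0 / (1 + len(S))

Q, eps = {}, EPS
for episode in range(500):
    S = []                                # elements selected so far
    while len(S) < len(ELEMENTS):
        state = frozenset(S)
        actions = [e for e in ELEMENTS if e not in S]
        if random.random() < eps:         # explore
            x = random.choice(actions)
        else:                             # exploit current Q-estimates
            x = max(actions, key=lambda a: Q.get((state, a), 0.0))
        r = toy_reward(x, S)
        next_state = frozenset(S + [x])
        next_actions = [e for e in actions if e != x]
        best_next = max((Q.get((next_state, a), 0.0) for a in next_actions),
                        default=0.0)
        q = Q.get((state, x), 0.0)
        Q[(state, x)] = q + ALPHA * (r + GAMMA * best_next - q)
        S.append(x)
    eps = max(EPS_MIN, eps * EPS_DECAY)   # decay toward the 0.5 floor
```

The state is the set of already-selected elements, so each episode is a permutation of the six decision elements; the decision order is then read off greedily from the learned Q-table.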
Table 8. Stage-wise summary of Q-values for six decision elements.
Stage | Sensor | Data Type | Data Collection Interval | Data Collection Period | Data Dimension | AI Technique
Stage 1 | 4.770 | 4.442 | 3.254 | 1.297 | 3.873 | 3.283
Stage 2 | | 0.383 | 0.525 | 0.222 | 0.274 | 0.438
Stage 3 | | 0.115 | | 0.096 | 0.128 | 0.127
Stage 4 | | 0.031 | | 0.032 | | 0.032
Stage 5 | | 0.000 | | 0.000 | |
Stage 6 | | | | 0.000 | |
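The decision order follows from Table 8 by taking, at each stage, the remaining element with the largest Q-value. A small sketch using the first three stages (where the argmax is unambiguous at the printed precision; the Stage 4 values for period and technique both round to 0.032, so that choice presumably rests on the unrounded values):

```python
# Stage-wise Q-values transcribed from Table 8 (column names abbreviated).
stage_q = [
    {"Sensor": 4.770, "Data type": 4.442, "Interval": 3.254,
     "Period": 1.297, "Dimension": 3.873, "Technique": 3.283},
    {"Data type": 0.383, "Interval": 0.525, "Period": 0.222,
     "Dimension": 0.274, "Technique": 0.438},
    {"Data type": 0.115, "Period": 0.096, "Dimension": 0.128,
     "Technique": 0.127},
]
order = [max(q, key=q.get) for q in stage_q]  # greedy readout per stage
print(order)  # ['Sensor', 'Interval', 'Dimension']
```

This reproduces the first three steps of the reported order: sensor, data collection interval, data dimension.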
Table 9. List of rule-based constraints applied in the simulation.
No.If
(Trigger Condition)
Then
(Constraint)
1Sensor = CameraData type = Image
2Sensor = CameraData dimension {<100 × 100 (2D), >100 × 100 (2D)}
3Sensor = SoundData dimension {<10 (1D), 10~100 (1D)}
4Sensor = SoundData collection interval = <0.01 s
5Sensor {Energy, Force, Position, Pressure, Rotation, Sound, Velocity/Acceleration, Vibration}Data type = Numeric
6Data type = ImageData dimension {<100 × 100 (2D), >100 × 100 (2D)}
7Data type = ImageSensor {Camera, Others}
8AI technique ≠ NNData dimension {<10 (1D), 10~100 (1D)}
9Data dimension {<100 × 100 (2D), >100 × 100 (2D)}AI technique = NN
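If–then constraints of this kind can be encoded as pairs of predicates over a decision dictionary and used to count episodes that require correction. The sketch below encodes only rules 1, 2, and 7 from Table 9 for illustration; the dictionary keys and category strings are assumptions about the data representation, not the authors' implementation.

```python
# Each rule: (trigger predicate, constraint predicate) over a decision dict.
# Only rules 1, 2, and 7 of Table 9 are encoded here as examples.
RULES = [
    (lambda d: d.get("sensor") == "Camera",
     lambda d: d.get("data type") == "Image"),                                # Rule 1
    (lambda d: d.get("sensor") == "Camera",
     lambda d: d.get("data dimension") in {"<100x100 (2D)", ">100x100 (2D)"}),  # Rule 2
    (lambda d: d.get("data type") == "Image",
     lambda d: d.get("sensor") in {"Camera", "Others"}),                      # Rule 7
]

def violated_rules(decision):
    """Indices of rules whose trigger fires but whose constraint fails."""
    return [i for i, (trig, cons) in enumerate(RULES)
            if trig(decision) and not cons(decision)]

def count_corrected(episodes):
    """Number of simulated episodes needing at least one correction."""
    return sum(1 for d in episodes if violated_rules(d))
```

A decision sequence that triggers a rule but violates its constraint counts as one corrected episode, matching the performance metric used in the simulation.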
Table 10. Adjusted CE by simulation case and decision stage (with AUC).

| Stage | Case (a) | Case (b) | Case (c) | Case (d) |
| 0 (Start) | 5.550 | 5.550 | 5.550 | 5.550 |
| 1 | 1.591 | 2.927 | 3.492 | 3.429 |
| 2 | 0.973 | 1.801 | 2.263 | 2.122 |
| 3 | 0.827 | 1.051 | 1.385 | 1.311 |
| 4 | 0.128 | 0.550 | 0.801 | 0.754 |
| 5 | 0.074 | 0.187 | 0.356 | 0.351 |
| 6 | 0.000 | 0.000 | 0.000 | 0.000 |
| AUC | 6.367 | 9.291 | 11.072 | 10.741 |
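The AUC row appears consistent with trapezoidal integration of the adjusted-CE curve over the unit-spaced stages 0–6; for instance, Case (b) reproduces 9.291 exactly, and Case (a) matches 6.367 up to rounding of the tabulated values. The helper below is a quick check, not part of the authors' code.

```python
def trapezoid_auc(values):
    """Area under a stage-wise curve with unit stage spacing (trapezoidal rule)."""
    return sum((a + b) / 2 for a, b in zip(values, values[1:]))

# Stage values for Cases (a) and (b), taken from Table 10
case_a = [5.550, 1.591, 0.973, 0.827, 0.128, 0.074, 0.000]
case_b = [5.550, 2.927, 1.801, 1.051, 0.550, 0.187, 0.000]
print(round(trapezoid_auc(case_a), 3))  # 6.368 (table reports 6.367)
print(round(trapezoid_auc(case_b), 3))  # 9.291
```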
Lee, G.-h.; Song, B.; Jeon, H.-w. Conditional Entropy-Based Sequential Decision-Making for AI Adoption in Manufacturing: A Reinforcement Learning Approach. Systems 2025, 13, 830. https://doi.org/10.3390/systems13090830