
Priority-Aware Actuation Update Scheme in Heterogeneous Industrial Networks

1 Division of Information & Communication Engineering, Kongju National University, Cheonan-daero, Cheonan 31080, Republic of Korea
2 Department of Electrical Engineering, KAIST, 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea
3 Department of Electronic Engineering, Kyung Hee University, Yongin-si 17104, Republic of Korea
4 Department of Internet of Things, SCH MediaLabs, Soonchunhyang University, 22 Soonchunhyang-ro, Shinchang-myeon, Asan-si 31538, Republic of Korea
5 School of Computer Science and Engineering, Kyungnam University, Changwon-si 51767, Republic of Korea
* Authors to whom correspondence should be addressed.
Sensors 2024, 24(2), 357; https://doi.org/10.3390/s24020357
Submission received: 15 December 2023 / Revised: 4 January 2024 / Accepted: 5 January 2024 / Published: 7 January 2024
(This article belongs to the Section Internet of Things)

Abstract

In heterogeneous wireless networked control systems (WNCSs), the age of information (AoI) of the actuation update and actuation update cost are important performance metrics. To reduce the monetary cost, the control system can wait for the availability of a WiFi network for the actuator and then conduct the update using a WiFi network in an opportunistic manner, but this leads to an increased AoI of the actuation update. In addition, since there are different AoI requirements according to the control priorities (i.e., robustness of AoI of the actuation update), these need to be considered when delivering the actuation update. To jointly consider the monetary cost and AoI with priority, this paper proposes a priority-aware actuation update scheme (PAUS) where the control system decides whether to deliver or delay the actuation update to the actuator. For the optimal decision, we formulate a Markov decision process model and derive the optimal policy based on Q-learning, which aims to maximize the average reward that implies the balance between the monetary cost and AoI with priority. Simulation results demonstrate that the PAUS outperforms the comparison schemes in terms of the average reward under various settings.

1. Introduction

With the advent of Industry 4.0, wireless networked control systems (WNCSs) have been applied to industrial networks for various services, such as industrial automation, smart manufacturing, and unmanned robot control [1,2]. WNCSs are considered a prominent solution for providing real-time and reliable actuation in industrial networks [3,4]. Compared to conventional NCSs based on wired networks, WNCSs are spatially distributed wireless control systems, and they have therefore been researched with respect to the enhancement of energy-harvesting capabilities [5], wireless resource scheduling [6], energy-aware performance optimization [7], and wireless attacks [8]. WNCSs generally consist of sensors, actuators, and a controller. Sensors collect the latest samples of environmental states and deliver them to the controller. After the controller computes control decisions for the actuators, it sends the corresponding control commands to them. In addition, as mobile actuators, such as mobile robots and automated guided vehicles, have recently been deployed, wireless control of mobile actuators has been actively studied. In the general operation of WNCSs, there are two principal updates: (1) status updates from the sensors to the controller, and (2) actuation updates from the controller to the actuators. Both require timely delivery because control applications in WNCSs are time-critical.
Since timeliness is an important metric in WNCSs, the age of information (AoI) has been introduced as a metric that quantifies the freshness of information updates [9,10]. AoI is defined as the amount of time elapsed since the latest delivered information (i.e., an update in WNCSs) was generated. It is measured from the destination's perspective and therefore increases linearly with time until an update is received at the destination. Specifically, an update generated at time u has AoI t − u at time t (t ≥ u). An update is said to be fresh when its AoI is close to zero. Since WNCSs require timely and fresh updates to improve their control performance, AoI has been adopted in WNCSs as a key performance metric [11,12].
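The AoI bookkeeping described above can be sketched in a few lines; this is a minimal illustration, and the delivery log and its values are hypothetical rather than taken from the paper:

```python
def aoi_at(t, deliveries):
    """AoI at time t given a log of (delivery_time, generation_time) pairs.

    AoI equals t - u, where u is the generation time of the most recently
    delivered update; it grows linearly until the next delivery arrives.
    """
    d_time, g_time = max((d for d in deliveries if d[0] <= t),
                         key=lambda d: d[0])
    return t - g_time

# Hypothetical delivery log: updates generated at times 0, 3, and 8,
# delivered at times 0, 5, and 9, respectively.
log = [(0, 0), (5, 3), (9, 8)]
```

For instance, just before the second delivery the AoI has grown to 4, and at the delivery instant it drops to the age of the newly delivered update (5 − 3 = 2).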
Since AoI was introduced, research on AoI for status updates in industrial networks and WNCSs has matured [2,12,13]. However, AoI for actuation updates has received little attention, even though it is critical for control performance. For example, delayed actuation updates can result in production inefficiency, plant destruction, and casualties [2,13]. In other words, the timeliness of the actuation update should be controlled by the controller in the WNCS. In addition, since the impact of AoI on the actuation update can differ according to the control priorities of the actuators (i.e., the robustness of the AoI of the actuation update) [14,15], priority needs to be considered when delivering the actuation update. For example, priority can be defined to classify purposes concerning the criticality or safety level at a particular moment [15]. Accordingly, high-priority actuation updates are more sensitive to changes in AoI than low-priority ones. Consequently, priorities should be incorporated with AoI.
Meanwhile, in industrial environments, heterogeneous wireless networks, such as cellular (e.g., 5G new radio (NR)) and WiFi networks [16,17], have been deployed. Accordingly, the type of network available for mobile actuators varies depending on the location. In this scenario, the actuation updates via cellular networks engender a monetary cost while the updates via WiFi networks are usually free to use. To reduce the monetary cost, the control system prefers to use WiFi networks for the actuation updates. Note that, compared to cellular networks that offer perfect coverage (i.e., always available), WiFi networks are distributed and therefore can be exploited in an opportunistic manner (i.e., they are intermittently available) [18]. This means that when the control system needs to deliver the actuation update, it can use the cellular network immediately. However, to reduce the monetary cost, the control system can deliver the update via a WiFi network after waiting until a WiFi network has become available, which can increase AoI. As explained above, the increased AoI results in a critical situation, especially for high-priority control commands. Consequently, it is important to determine the appropriate actuation update policy, considering both the monetary cost and AoI with priority.
To address the AoI control problem considering heterogeneous networks, there have been several works [18,19,20,21,22,23]. These can be categorized into the following: (1) status update design [19,20,21,22]; and (2) actuation update design [18,23]. Pan et al. [19] determined the scheduling policy to transmit status updates over an unreliable but fast channel or a slow, reliable channel to minimize AoI. A Markov decision process (MDP) model was exploited to formulate and solve the optimal scheduling problem. Bhati et al. [20] analyzed the average AoI with heterogeneous multiple servers and determined the optimal routing parameter between the servers to minimize the average AoI. For the system model, M/M/1 queuing models with different service rates among the servers were assumed. Fidler et al. [21] showed the effect of independent parallel channels on AoI based on the queuing models. Specifically, G/G/1 queuing models with Markov channels were used for the parallel systems with a time-varying capacity. Xie et al. [22] formulated the generalized scheduling problem in multi-sensor multi-server systems to minimize AoI. This paper jointly considered link scheduling, server selection, and service preemption, and formulated an MDP problem to find the optimal policy. As explained above, these papers addressed the AoI control problem considering heterogeneous networks. However, these papers mainly focused on status updates (i.e., irrespective of the actuation update) and did not consider priority. Altman et al. [18] and Raiss-el-fenni et al. [23] introduced the receiver’s policy to decide whether to receive updates from cellular or WiFi networks to minimize costs. They focused on the receiver’s perspective about whether to activate the device or not. However, it is difficult for the specific application to control device activation without the user’s involvement. 
Consequently, it is more suitable to determine the policy from the control system (i.e., the application server), as proposed in this paper. In addition, these papers did not consider priority either.
To address these challenges, this paper proposes a priority-aware actuation update scheme (PAUS) that jointly considers cost and AoI with priority. In the PAUS, the control system determines whether to deliver or delay the actuation update to the actuator based on AoI with priority and cost. We formulate a Markov decision process (MDP) model and determine the optimal policy based on Q-learning (QL). Simulation results demonstrate that the PAUS reduces the cost while satisfying the required AoI.
The main contributions of this paper are as follows: (1) to the best of our knowledge, this is the first work where the actuation update is determined jointly considering AoI and monetary cost; (2) the increasing rate of AoI is determined according to the control priorities (i.e., robustness of AoI of the actuation update) to consider the different impact of AoI with priority on the actuation update; (3) an MDP model is formulated to maximize the average reward that implies the balance between AoI with priority and monetary cost; (4) the optimal policy on whether to deliver or delay the actuation update to the actuator can be obtained using QL; and (5) extensive simulation results present the performance of the PAUS under various settings, which can be utilized as the guidelines for the control system operator.
The remainder of this paper is organized as follows. The system model and problem formulation are provided in Section 2 and Section 3, respectively. The QL-based algorithm is presented in Section 4. After simulation results are provided in Section 5, this paper is concluded with future works suggested in Section 6.

2. System Model

Figure 1 presents the system model of this paper. In our model, a control system (i.e., the controller) delivers the actuation update to the mobile actuator using either the cellular base station (CBS) or a WiFi access point (WAP). As we mentioned above, the CBS is always available, whereas a WAP is only available when the actuator is close enough to the WAP [18]. In addition, since the control system (i.e., the controller) delivers the update, we assume that the transmission energy can be ignored. Moreover, it is assumed that there is no transmission error because the model is focused on the actuation update delivery (i.e., it is not focused on physical communication) [24,25].
The monetary cost of actuation updates depends on the network type. The use of the cellular network (i.e., via the CBS) incurs monetary costs based on the data plans of network operators. On the other hand, the use of WiFi networks (i.e., via a WAP) is usually free. Therefore, the actuation update via a WAP is efficient in terms of reducing the monetary cost for the system operator. However, WAPs are only intermittently available [18]. Consequently, actuation updates using WAPs in an opportunistic manner can lead to increased AoI while reducing the monetary cost. Since increased AoI leads to critical situations (e.g., production inefficiency and casualties [2,13]), it is necessary to maintain low AoI. Moreover, since actuation updates have control priorities, priority should be considered when delivering them. For example, a high-priority update has a stricter AoI requirement than a low-priority update, which is relatively insensitive to AoI [15]. Note that the AoI requirement includes the transmission time between the controller and the actuator.
Figure 2 shows the specific timing diagram for the PAUS. At each decision epoch (e.g., t_0, t_1, …, t_7), the control system determines whether to deliver or delay the actuation update, considering the available network of the actuator (i.e., CBS or WAP), the current AoI, and the priority of the update. In Figure 2, the thick horizontal solid lines from the CBS and WAP denote the availability of the network. For example, a WAP is available between t_2 and t_3, while the CBS is always available. When the control system receives a status update, it can wait for a WAP to reduce the monetary cost. If a WAP becomes available, the control system delivers the update via the WAP (e.g., between t_2 and t_3) without monetary cost. Otherwise, the control system should deliver the update via the CBS (e.g., between t_6 and t_7) before the target AoI requirement is exceeded, even though this update incurs monetary costs.
Therefore, it is important to determine an actuation update policy that can minimize monetary cost while maintaining AoI below a desired value, considering priority. To determine the optimal policy, this paper formulates an MDP problem in the next section.

3. Problem Formulation

In this section, we formulate an MDP model based on the timing diagram in Figure 2. In the formulated MDP model, the actuation update can be delivered via either the CBS or a WAP. Furthermore, if a WAP is not currently available, the control system can delay the update with the expectation of future WAP contact.
Whether to deliver the actuation update (i.e., via the CBS or a WAP) or delay it is determined at each decision epoch t ∈ T = {1, 2, 3, …} according to the state at that epoch.

3.1. State Space

At each decision epoch, the state set S can be defined as
S = L × ∏_i V_i × ∏_i E_i
where L denotes the availability of WAPs. In addition, V i and E i represent the current AoI and the existence of the actuation update with priority i, respectively.
First, L can be defined as
L = {0, 1}
where l (∈ L) represents whether the actuator can receive information from a WAP or not. In other words, l = 0 means that the actuator cannot connect to a WAP (i.e., it can only connect to the CBS) because no WAP is available. Otherwise (i.e., l = 1), the actuator can connect to a WAP as well as the CBS because a WAP is close to the actuator.
Moreover, V i can be defined as
V_i = {0, 1, …, V_m}
where v_i (∈ V_i) denotes the current AoI of the actuation update with priority i and V_m is the maximum AoI in the system model.
In addition, E i can be defined as
E_i = {0, 1}
where e_i (∈ E_i) denotes the existence of the actuation update with priority i. In other words, e_i = 1 means that the actuation update with priority i exists at the control system and needs to be delivered to the actuator. Otherwise (i.e., e_i = 0), the actuation update with priority i does not exist.
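As a sanity check on the size of S, the state set can be enumerated directly. The two-priority setting below is an illustrative choice, and V_m = 20 is borrowed from the defaults in Section 5:

```python
from itertools import product

V_M = 20                 # maximum AoI (illustrative; Section 5 uses 20)
PRIORITIES = (1, 2)      # two priority classes, for illustration only

L_SPACE = (0, 1)                 # WAP unavailable / available
V_SPACE = tuple(range(V_M + 1))  # per-priority AoI: {0, ..., V_m}
E_SPACE = (0, 1)                 # per-priority update absent / present

# S = L x prod_i V_i x prod_i E_i
states = list(product(L_SPACE,
                      *(V_SPACE for _ in PRIORITIES),
                      *(E_SPACE for _ in PRIORITIES)))
```

With two priorities this yields 2 × 21² × 2² = 3528 states, which is small enough for a tabular method and consistent with the choice of Q-learning in Section 4.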

3.2. Action Space

At each decision epoch, the control system determines an action (i.e., deliver or delay). Consequently, let A = ∏_i A_i denote the global action space for the actuator, where A_i is the local action space of the actuation update with priority i. A_i can be defined as
A_i = {0, 1}
where 0 and 1 stand for the defined actions. Specifically, a_i (∈ A_i) = 0 means that the control system delivers the update to the actuator. On the other hand, a_i = 1 means that the control system delays the update.

3.3. Transition Probability

The transition probability from the current state s (∈ S) to the next state s′ (∈ S) when the control system chooses action a can be described as
P[s′ | s, a] = P[l′ | l] × ∏_i P[v′_i, e′_i | v_i, e_i, a_i],
because the availability of a WAP does not depend on the other states or on the chosen action. In addition, the existence of the update does not depend on the other states, while the current AoI depends on the existence of the update. Consequently, P[v′_i, e′_i | v_i, e_i, a_i] can be rearranged as
P[v′_i, e′_i | v_i, e_i, a_i] = P[v′_i | v_i, e_i, a_i] × P[e′_i | e_i, a_i].
We assume that the duration of the disconnection (connection) between a WAP and an actuator follows the exponential distribution with mean 1/λ_D (1/λ_C) [24,25]. Consequently, the probability that the actuator can connect to a WAP during τ is λ_C τ. In addition, the actuator can disconnect from a WAP during τ with probability λ_D τ. Therefore, P[l′ | l = 0] and P[l′ | l = 1] can be defined by
P[l′ | l = 0] = { 1 − λ_C τ, if l′ = 0;  λ_C τ, if l′ = 1 }
and
P[l′ | l = 1] = { 1 − λ_D τ, if l′ = 1;  λ_D τ, if l′ = 0 }.
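Under these transition probabilities, the WAP availability l evolves as a two-state Markov chain. A small sampling sketch follows; the rates and the epoch length τ are the Section 5 defaults, used here as assumptions:

```python
import random

LAMBDA_C = 0.2   # connection probability per epoch (Section 5 default)
LAMBDA_D = 0.4   # disconnection probability per epoch (Section 5 default)
TAU = 1.0        # decision-epoch length

def next_wap_state(l, rng=random):
    """Sample l' from P[l' | l] for the two-state availability chain."""
    if l == 0:
        # connect with probability lambda_C * tau
        return 1 if rng.random() < LAMBDA_C * TAU else 0
    # disconnect with probability lambda_D * tau
    return 0 if rng.random() < LAMBDA_D * TAU else 1
```

The stationary availability of this chain is λ_C / (λ_C + λ_D) = 1/3 under these defaults, i.e., the actuator sees a WAP roughly a third of the time, which is why waiting for WiFi can noticeably inflate AoI.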
If the control system delays the actuation update while an update exists, the current AoI increases up to V_m. If v_i reaches V_m, the control system must deliver the actuation update to the actuator. In this paper, AoI increases at different rates according to the priority i. This is because, even if the same amount of time elapses, it is perceived as relatively more time for a high-priority update (i.e., higher i) than for a low-priority update (i.e., lower i). In other words, the AoI growth rate of high-priority updates (e.g., those with a high criticality level) is higher than that of low-priority updates because increased AoI is much more critical for high-priority updates. Moreover, when the control system delivers the actuation update, the corresponding AoI becomes 0. Consequently, P[v′_i | v_i, e_i, a_i] can be described as
P[v′_i | 0 ≤ v_i < V_m, e_i = 1, a_i = 1] = { 1, if v′_i = v_i + h(i);  0, otherwise },
P[v′_i | v_i = V_m, e_i = 1] = { 1, if v′_i = 0;  0, otherwise },
P[v′_i | v_i, e_i = 0] = { 1, if v′_i = 0;  0, otherwise },
and
P[v′_i | v_i, e_i = 1, a_i = 0] = { 1, if v′_i = 0;  0, otherwise }
where h(i) is an increasing function of the priority i (e.g., a linearly increasing function).
We assume that new actuation updates with priority i arrive according to a Poisson process with rate λ^U_i [26]. Consequently, the probability that the control system receives a new actuation update with priority i during τ is λ^U_i τ. Therefore, P[e′_i | e_i, a_i] can be described as
P[e′_i | e_i, a_i = 0] = { λ^U_i τ, if e′_i = 1;  1 − λ^U_i τ, if e′_i = 0 },
P[e′_i | e_i = 0, a_i = 1] = { λ^U_i τ, if e′_i = 1;  1 − λ^U_i τ, if e′_i = 0 },
and
P[e′_i | e_i = 1, a_i = 1] = { 1, if e′_i = 1;  0, if e′_i = 0 }.
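Putting the AoI and update-existence transitions together, one decision-epoch step for a single priority class can be sketched as follows. The choices h(i) = i and the arrival rate are illustrative assumptions, not values fixed by the model:

```python
import random

V_M = 20          # maximum AoI
LAMBDA_U = 0.3    # per-epoch update arrival probability (assumed)
TAU = 1.0         # decision-epoch length

def h(i):
    """Assumed linear AoI growth rate in the priority i."""
    return i

def step(i, v, e, a, rng=random):
    """Sample (v', e') for priority i per the transitions in Section 3.3."""
    # AoI: reset on delivery (a = 0), at the cap V_m, or when no update
    # exists; otherwise grow by h(i) while the update is delayed.
    if a == 0 or v == V_M or e == 0:
        v_next = 0
    else:
        v_next = min(v + h(i), V_M)
    # Existence: a delayed update stays pending; otherwise a new update
    # arrives with probability lambda_U * tau.
    if e == 1 and a == 1:
        e_next = 1
    else:
        e_next = 1 if rng.random() < LAMBDA_U * TAU else 0
    return v_next, e_next
```

Note how the priority enters only through h(i): delaying a priority-2 update ages it twice as fast as a priority-1 update under this assumed linear rate.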

3.4. Reward and Cost Functions

For the reward and cost functions, we consider the monetary and delivery costs as well as the current AoI. Specifically, the total reward function r(s, a) is defined as
r(s, a) = w g(s, a) − (1 − w) f(s, a)
where g(s, a) is the reward function based on AoI and f(s, a) is the cost function based on the monetary and delivery costs. Note that the delivery cost denotes the additional cost caused by the delivery, such as energy consumption or association overhead [18]. w (0 ≤ w ≤ 1) is a weight factor that balances g(s, a) and f(s, a).
Specifically, g(s, a) can be obtained by
g(s, a) = −∑_i {δ_cur,i(t) − τ_target,i}^+,
where δ_cur,i(t) is the current AoI of the update with priority i at time t and τ_target,i is the target AoI, which can be considered the service requirement of the update with priority i. In addition, {x}^+ denotes the ramp function, defined as
{x}^+ = { x, if x ≥ 0;  0, otherwise }.
In addition, f(s, a) can be represented as
f(s, a) = { C_m + C_t, if a = 0;  0, otherwise }
where C_m and C_t are the monetary and delivery costs, respectively, when the control system delivers the actuation update. C_m and C_t are predefined constants that allow the monetary and delivery costs to be balanced within the cost function.
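The reward structure above translates directly into code. This is a minimal sketch; the numeric constants are the Section 5 defaults, used here as assumptions, and the network labels are our own:

```python
W = 0.7                          # weight factor w (Section 5 default)
C_M = {"cbs": 4, "wap": 0}       # monetary cost per network type (assumed)
C_T = 1                          # delivery cost (assumed)
TAU_TARGET = {1: 10, 2: 10}      # target AoI per priority (assumed)

def g(aoi_by_priority):
    """AoI reward: penalize only the excess of each AoI over its target,
    i.e., the ramp function {x}+ applied per priority."""
    return -sum(max(aoi - TAU_TARGET[i], 0)
                for i, aoi in aoi_by_priority.items())

def f(a, network):
    """Cost: monetary plus delivery cost only when delivering (a = 0)."""
    return C_M[network] + C_T if a == 0 else 0

def r(aoi_by_priority, a, network):
    """Total reward r(s, a) = w * g(s, a) - (1 - w) * f(s, a)."""
    return W * g(aoi_by_priority) - (1 - W) * f(a, network)
```

Because g penalizes only the excess over the target, delaying an update is free (in reward terms) until its AoI crosses τ_target,i, which is exactly the slack the PAUS exploits to wait for a WAP.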

4. QL-Based Actuation Update Algorithm

To find the optimal policy for the MDP model formulated in Section 3, this paper proposes a QL-based algorithm. QL is a typical reinforcement learning algorithm for solving sequential decision problems [27] with low computational complexity [27,28] and low memory usage [29]. QL uses a state–action value Q(s, a) for a given state s and action a. After Q(s, a) is initialized to zero, it is updated at each subsequent iteration by
Q(s, a) ← Q(s, a) + α (R + γ max_{a′ ∈ A} Q(s′, a′) − Q(s, a))
where α, R, and γ denote the learning rate, instant reward, and discount factor, respectively. To balance exploitation and exploration, a decaying ϵ-greedy approach is used for the iterative updates of Q(s, a). Specifically, the agent (i.e., the control system) randomly selects an action with probability ϵ and selects the greedy action with the maximum Q(s, a) with probability 1 − ϵ. In addition, ϵ gradually decreases during the iterative updates so that the agent initially explores the environment and finally exploits the greedy action. After Q(s, a) converges to the optimum, the best action for every state can be selected as arg max_a Q(s, a). The detailed steps of the Q(s, a) update are given in Algorithm 1. As shown in Algorithm 1, if the convergence condition is satisfied (lines 9–10), we obtain the optimal policy (line 11). Otherwise, Q(s, a) is iteratively updated (lines 2–8).
Note that because state and action spaces are not large in the current system model as defined in Section 2, QL is exploited to solve the formulated problem with low computational complexity [27,28] and low memory usage [29]. In our future work, when state and action spaces become larger than those in the current system model, deep reinforcement learning approaches, such as a deep deterministic policy gradient, will be considered, which have strong performance in handling larger state and action spaces [30,31].
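The update rule and the decaying ϵ-greedy selection can be sketched in a few lines; α and γ are the Section 5 defaults, the state encoding is left abstract, and the tabular dictionary layout is our own assumption:

```python
import random
from collections import defaultdict

ALPHA, GAMMA = 0.2, 0.95   # learning rate and discount factor (Section 5)

Q = defaultdict(float)      # Q[(state, action)], initialized to zero

def select_action(s, actions, eps, rng=random):
    """Decaying eps-greedy: explore with probability eps, else act greedily."""
    if rng.random() < eps:
        return rng.choice(actions)
    return max(actions, key=lambda a: Q[(s, a)])

def q_update(s, a, r, s_next, actions):
    """One QL step: Q(s,a) += alpha * (R + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])
```

A `defaultdict` keeps the table sparse: state–action pairs never visited implicitly stay at their zero initialization, matching the initialization in Algorithm 1.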
Algorithm 1 Steps for the Q(s, a) update
1: Initialize parameters: Q(s, a) (∀ s ∈ S, a ∈ A), initial probability ϵ, count value c, learning rate α, discount factor γ, and episode length T.
2: Copy the current Q(s, a) to Q_c(s, a) to track changes after one episode.
3: for each episode from 1 to T do
4:    At each decision epoch, observe the current state s
5:    Use the ϵ-greedy approach to select an action a
6:    Calculate the reward R and observe the next state s′
7:    Update Q(s, a) according to (21)
8: end for
9: If every element of |Q(s, a) − Q_c(s, a)| ≤ α, then c ← c + 1.
10: If c = 10, go to the next step. Otherwise, ϵ ← max(0.01, ϵ − 0.001) and go to step 2.
11: Compute the optimal policy arg max_a Q(s, a).

5. Performance Analysis Results

To evaluate the performance, we conduct extensive simulations using a Python-based event-driven simulator, where each simulation includes 10,000 decision epochs and the average of 10 simulation runs is used for the average reward. We compare the proposed scheme (i.e., the PAUS) with the following four schemes: (1) SEND, where the control system delivers the actuation update immediately when a new actuation update occurs to minimize AoI; (2) TARGET, where the control system delays the actuation update and then delivers it right before the target AoI requirement is exceeded; (3) PERIOD, where the control system delivers the actuation update periodically; and (4) WAIT, where the control system waits for WiFi availability to make the best use of free WiFi delivery.
The default parameter settings are as follows. The average disconnection and connection probabilities between the WAP and the actuator are set to 0.4 and 0.2 [25], respectively. The default values of V_m and w are set to 20 and 0.7, respectively. In addition, h(i) is assumed to be a linear function of i with a unit coefficient. Furthermore, we assume that there are five priorities, where one is the lowest (i.e., least critical) and five is the highest (i.e., most critical) [15]. Moreover, τ_target,i, λ^U_i, and the period of PERIOD are set to 10, 0.3, and 10 decision epochs, respectively. C_m and C_t for using the CBS are set to 4 and 1, respectively, while those for using a WAP are set to 0 and 1, respectively. For the Q(s, a) update, α, γ, T, and ϵ are set to 0.2, 0.95, 1000, and 0.99, respectively. Although these default settings are assumed, parameter values can differ between scenarios, so we analyze the effect of these parameters (i.e., changes in the weight factor, actuation update arrival rate, monetary cost, and WAP connection probability) on performance in the following analysis.
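For reference, the four comparison policies reduce to simple decision rules. The signatures below (state variables v, e, l and epoch index t) are our own sketch of the schemes as described in the text, not code from the paper:

```python
def send(v, e, l, t):
    """SEND: deliver (action 0) as soon as an update exists."""
    return 0 if e == 1 else 1

def target(v, e, l, t, tau_target=10, growth=1):
    """TARGET: delay, then deliver just before the target AoI is exceeded."""
    return 0 if e == 1 and v + growth >= tau_target else 1

def period(v, e, l, t, p=10):
    """PERIOD: deliver any pending update every p decision epochs."""
    return 0 if e == 1 and t % p == 0 else 1

def wait(v, e, l, t):
    """WAIT: deliver only when a WAP is available (l = 1)."""
    return 0 if e == 1 and l == 1 else 1
```

Unlike the PAUS, none of these rules weighs monetary cost against AoI jointly: SEND and TARGET ignore the network type, PERIOD ignores AoI, and WAIT ignores the AoI requirement.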
Figure 3 shows the overall performance of the accumulated reward, AoI satisfaction ratio, and total monetary cost according to the simulation time. In Figure 3a, as the simulation time increases, the accumulated rewards for all schemes decrease because AoI and the monetary cost are accumulated. Among them, the PAUS achieves the highest accumulated reward because it jointly considers AoI and monetary cost. On the other hand, WAIT has the lowest accumulated reward because it waits for WiFi, which leads to increased AoI. Meanwhile, in Figure 3b, it is found that the PAUS, SEND, and TARGET can guarantee the AoI requirement (i.e., 100 % satisfaction ratio), while PERIOD and WAIT cannot. This is because PERIOD and WAIT deliver the actuation update periodically and only with WiFi, respectively, without consideration of AoI. In addition, Figure 3c shows the accumulated cost among them. Among the PAUS, SEND, and TARGET, which have 100 % satisfaction ratios, it can be noted that the PAUS has the lowest accumulated cost. This means that the PAUS can minimize the monetary cost while maintaining AoI within the required value.
Figure 4 shows the average reward and AoI satisfaction ratio according to the weight factor w. In Figure 4a, as w increases, the average rewards of PERIOD and WAIT decrease because of the increasing AoI. Between them, the average reward of WAIT is higher than that of PERIOD because WAIT tries to reduce AoI whenever WiFi is available. On the other hand, as w increases, the average rewards of SEND and TARGET increase due to the reduced AoI. Between them, the increasing rate of SEND is higher than that of TARGET because SEND can minimize AoI as w increases. Meanwhile, the PAUS achieves the highest average reward. This is because it can reduce the monetary cost at a lower w and AoI at a higher w. In Figure 4b, the PAUS cannot guarantee the AoI requirement at a lower w, unlike SEND and TARGET, which can satisfy the AoI requirement. This is because, at a lower w, the PAUS aims to reduce the monetary cost, which can increase AoI, to maximize the total reward function defined in (17). On the other hand, SEND and TARGET achieve 100% AoI satisfaction ratios because SEND and TARGET try to deliver the actuation update immediately when a new update occurs and before the target AoI requirement is exceeded, respectively. However, they incur higher monetary costs, which ultimately reduce the average reward, as shown in Figure 4a. Consequently, for the PAUS, w needs to be set higher than 0.6 to guarantee the AoI requirement.
Figure 5 shows the average reward and AoI satisfaction ratio according to the actuation update arrival rate λ U . In Figure 5a, as λ U increases, the average rewards of all schemes decrease because an increasing λ U increases the number of deliveries, which can lead to monetary costs or delayed updates. Among them, the decreasing rate of SEND and PERIOD is higher than that of others. In the case of SEND, this is because as λ U increases, the number of updates via the CBS becomes higher, which increases the monetary cost. On the other hand, in the case of PERIOD, the periodical actuation update is still used even when λ U increases, which results in delayed updates. Overall, the PAUS achieves the highest average reward because it aims to minimize the cost jointly considering the monetary cost and AoI. In addition, as shown in Figure 5b, even when λ U increases, the PAUS, SEND, and TARGET can guarantee the AoI requirement. On the other hand, PERIOD and WAIT cannot guarantee the AoI requirement because PERIOD still uses the periodical actuation update and WAIT delays the actuation update and waits for WiFi irrespective of λ U changes.
Figure 6 shows the average reward and AoI satisfaction ratio according to the monetary cost C m . In Figure 6a, as C m increases, the average rewards of all schemes decrease because increased C m leads to higher monetary cost. Among them, the decreasing rate of SEND is higher than that of others because SEND immediately tries to deliver the actuation update even when only the CBS is available. On the other hand, WAIT has the lowest decreasing rate because WAIT always prefers to use a WAP. Overall, the PAUS achieves the highest average reward. This is because the PAUS can fully utilize either the CBS at a lower C m or a WAP at a higher C m . In Figure 6b, it is shown that the PAUS cannot guarantee the AoI requirement at a higher C m compared to SEND and TARGET, which can satisfy the AoI requirement. This is because at a higher C m , the PAUS aims to reduce the monetary cost, which can increase AoI to maximize the total reward function defined in (17). On the other hand, SEND and TARGET try to deliver the actuation update without consideration of the monetary cost, which finally reduces the average reward, as shown in Figure 6a. Note that, if the system operator needs to enhance the AoI satisfaction ratio, even at a higher C m , the weight factor w in the total reward function can be adjusted.
Figure 7 shows the average reward and AoI satisfaction ratio according to the WAP connection probability λ_C. In Figure 7a, as λ_C increases, the average rewards of all schemes increase because an increased λ_C leads to a lower monetary cost. Among them, the increasing rate of WAIT is higher than that of the others because the increasing λ_C results in more opportunities to deliver updates via a WAP, which can also reduce AoI, as shown in Figure 7b. Overall, as presented in Figure 7a,b, the PAUS achieves the highest average reward while guaranteeing the AoI requirement. This is because the PAUS can fully utilize either the CBS at a lower λ_C or a WAP at a higher λ_C.
Figure 8 shows the average reward and AoI satisfaction ratio according to the AoI requirement τ t a r g e t . In Figure 8a, as τ t a r g e t increases, the average rewards of all schemes except for SEND (i.e., PAUS, WAIT, PERIOD, and TARGET) increase because there is enough time to wait for WiFi, which can reduce the monetary cost. However, because SEND delivers actuation updates irrespective of the AoI requirement, the average reward of SEND does not change according to the AoI requirement. From Figure 8b, although the AoI satisfaction ratios of WAIT and PERIOD increase, they still cannot guarantee the AoI requirement.

6. Conclusions

This paper proposed a priority-aware actuation update scheme (PAUS), in which the control system determines whether to deliver or delay the actuation update, considering the monetary cost and AoI with priority. To find the optimal policy, we formulated an MDP model and provided a QL-based solution to maximize the average reward, which balances AoI with priority against the monetary cost. Simulation results demonstrated that the PAUS outperforms the comparison schemes in terms of the average reward. In addition, it was shown that the average reward is influenced by the weight factor, the actuation update arrival rate, the monetary cost, and the WiFi access point connection probability. Moreover, the PAUS can minimize the monetary cost while maintaining AoI below the required value by adjusting the weight factor. In future work, the PAUS will be integrated into the non-public network (or private 5G) architecture to provide industrial network solutions, such as priority-based wireless control of mobile robots and automated guided vehicles.

Author Contributions

Conceptualization, Y.K. (Yeunwoong Kyung); methodology, H.K.; validation, Y.K. (Youngjun Kim); formal analysis, J.S., T.S. and Y.K. (Yeunwoong Kyung); investigation, J.S. and Y.K. (Youngjun Kim); resources, J.S., H.K. and T.S.; data curation, Y.K. (Yeunwoong Kyung) and Y.K. (Youngjun Kim); writing—original draft preparation, Y.K. (Yeunwoong Kyung); writing—review and editing, Y.K. (Yeunwoong Kyung) and Y.K. (Youngjun Kim); visualization, Y.K. (Yeunwoong Kyung); supervision, T.S., H.K. and Y.K. (Youngjun Kim); funding acquisition, Y.K. (Youngjun Kim). All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the research grant of Kongju National University in 2023 and in part by the "Regional Innovation Strategy (RIS)" through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (MOE) (2021RIS-003).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

Figure 1. System model.
Figure 2. Timing diagram for PAUS.
Figure 3. Overall performance of the accumulated reward, AoI satisfaction ratio, and accumulated cost according to the simulation time.
Figure 4. The average reward and AoI satisfaction ratio according to weight factor w.
Figure 5. The average reward and AoI satisfaction ratio according to the actuation update arrival rate λU.
Figure 6. The average reward and AoI satisfaction ratio according to the monetary cost Cm.
Figure 7. The average reward and AoI satisfaction ratio according to the WAP connection probability λC.
Figure 8. The average reward and AoI satisfaction ratio according to the AoI requirement τtarget.

