MDP-Based MAC Protocol for WBANs in Edge-Enabled eHealth Systems

Abstract: In recent years, eHealth systems based on the Internet of Things (IoT) have attracted considerable attention. The wireless body area network (WBAN) is an essential technology of eHealth systems. A major challenge in WBANs is the design of the medium access control (MAC) protocol, which plays a significant role in avoiding collisions, enhancing the energy efficiency, maximizing the network lifetime, and improving the quality of service (QoS) as well as the quality of experience (QoE). In this study, we apply the mobile edge computing (MEC) network architecture to an eHealth system and design a multi-channel MAC protocol for WBANs based on the Markov decision process (MDP). In this protocol, the channel condition and the reward value are considered. By continuously interacting with the environment, the protocol generates the optimal channel resource allocation strategy. Simulation results indicate that the proposed WBAN MAC protocol can adaptively assign different channels to the sensor nodes for data transmission, thereby reducing the collision rate, decreasing the energy consumption, improving the channel utilization, and enhancing the system throughput and QoE.


Introduction
The development of wearable sensing devices and communication technologies has facilitated diverse applications in many fields such as healthcare. The traditional healthcare system is under severe pressure due to the growth of the aging population and the increase in chronic diseases [1]. Therefore, eHealth systems based on wearable body sensing devices and the Internet of Things (IoT) are considered promising technologies for realizing remote healthcare services [2,3]. Wireless sensor networks (WSNs) have evolved into a technology suited to human mobility through the development of wearable or implanted sensors. The main goal of these networks is to collect data and transfer the data to a remote server [4]. This technology is referred to as the wireless body area network (WBAN), which consists of several sensors and a central base station or coordinator [5]. These heterogeneous, miniature, energy-constrained sensors capture the physiological information of the human body, such as the body temperature, heartbeat, ECG, and EEG, and then send the physiological data to the central coordinator or base station [6]. Besides healthcare, WBANs are employed in various fields, such as sports, entertainment, military, and emergency rescue.
Traditional eHealth systems based on WBANs generally employ the cloud-based network architecture [7]. Data are gathered by the sensor node, transmitted to the coordinator, and then sent to the remote cloud data center [8]. The resulting time delay is longer than that of local processing. However, emergencies such as heart attacks require a rapid response. The data transmission delay between the user and the remote cloud server can be quite long, which may result in a late response. Recently, there has been a trend of moving computing closer to the users [9]. Mobile edge computing (MEC) has emerged as a popular research topic in the field of communication and networking. In the edge-enabled network architecture, MEC servers are deployed near the nodes of WBANs [10].
The communication between sensing devices and the coordinator is supported by the IEEE 802.15.4 and IEEE 802.15.6 standard protocol stacks. The IEEE 802.15.4 standard was designed for low-rate wireless personal area networks (LR-WPANs) [11], and it can also support WBANs. The IEEE 802.15.6 standard is geared toward applications of WBANs. It defines the physical (PHY) and medium access control (MAC) layers [12]. There are three types of PHY layers: human body communications (HBC), ultra-wideband (UWB), and narrowband (NB). In the MAC layer, the communication channels are separated into superframes. Beacon frames are used for synchronization in the beacon-enabled mode.
The MAC layer protocol is in charge of the time and channel access in a network. It plays an important role in the network resource allocation and influences the network performance [13]. Energy efficiency is a critical metric to be considered when designing WBAN protocols. The sensing devices of WBANs are implanted in the human body and/or attached to the clothes (i.e., they are wearable). Their batteries have limited capacity and are difficult to replace. A well-designed MAC protocol can significantly improve the energy efficiency of WBANs. In energy-restricted networks, the MAC protocol is critical for decreasing the power consumption and prolonging the network lifetime [13]. The responsibility of the MAC layer is to coordinate the sensing devices when they access the WBAN channels. Therefore, energy efficiency is the main property of a proper WBAN MAC protocol. If the channel and/or time slot utilization can be enhanced, data collisions and/or beacon frame collisions can be mitigated, the idle time can be decreased, and the energy consumption of WBANs can be reduced considerably.
In addition, heterogeneous wearable body sensing devices produce different types of data packets that have various requirements in terms of the network quality of service (QoS). Some applications require a high probability of success in packet delivery (i.e., high packet delivery ratio) as well as timely packet delivery (i.e., low latency). For instance, for heart disease patients, the tolerable time delay for heart-rate information is shorter than that for body temperature. Moreover, the network QoS (time delay, delivery ratio, reliability, fairness, throughput, and so on) must be enhanced. Generally, medical and healthcare applications are mission critical, and they need reliable data delivery [14]. Technologies and devices from all the protocol layers of a WBAN are responsible for reliability. In [15], the transmission control protocol (TCP) is utilized. The MAC protocol for WBANs should be flexible in order to support diverse data and various applications.
Furthermore, there are contradictions among the different QoS requirements. For instance, delay-bounded and reliable data delivery may require higher energy consumption, thereby shortening the network lifetime. This problem complicates the design of the MAC protocol in terms of satisfying the QoS requirements. To satisfy the system QoS, the MAC protocol must be designed properly. In particular, the MAC protocol must achieve a trade-off between the energy efficiency and the data transmission reliability.
In this study, we apply an MEC-based network architecture to an eHealth system. We design an energy-efficient multi-channel MAC protocol based on a Markov decision process (MDP). This protocol attempts to minimize the beacon and data collisions of coordinators and sensor nodes by learning the collision patterns as well as by effectively assigning frequencies over time to ensure that neighboring nodes do not collide with one another.
The remainder of this paper is organized as follows. Section 2 reviews the related studies. Section 3 presents the MEC-based network architecture. Section 4 introduces the priority-aware MAC layer resource allocation mechanism. Section 5 discusses the performance evaluation. Finally, Section 6 concludes the paper.

Related Work
Recently, numerous MAC protocols have been proposed for WBANs [16]. In this section, we review some existing MAC protocols that attempt to improve the power efficiency and system QoS of wearable sensor eHealth systems.
WBAN MAC protocols can be classified into two types: contention-based protocols and contention-free protocols. In contention-based MAC protocols, the sensor nodes access the communication channels through competition. It is not necessary to establish a fixed structure; hence, such protocols are more scalable. However, data packet collisions may occur during data transmission, which may lead to retransmission and energy wastage. By contrast, in contention-free MAC protocols, the communication resources such as time and channels are assigned to each sensor node. For example, the well-known time-division multiple access (TDMA) MAC protocol divides the timeline into equal-length time slots and assigns them to the sensing devices. During the assigned time slots, each sensor node transmits the physiological data to the coordinator. Subsequently, the sensor nodes go into the sleep mode to save energy. Contention-free MAC protocols can reduce not only the probability of beacon and/or data collisions but also the energy consumption due to retransmissions.
In [17], the authors proposed the in-body sensor medium access control (i-MAC) protocol for implanted sensor communication in WBANs. They designed a superframe structure that separates emergency and regular event access. Further, they proposed a scheduled access mechanism according to the node priority. EE-DCAA [18] is a channel allocation mechanism that is compatible with the IEEE 802.15.6 WBAN standard. It exploits polling and contention access mechanisms. MG-HYMAC [19] is a hybrid MAC for WBANs. It integrates a transmission scheduling scheme into duty cycle operations in order to reduce the system energy consumption and extend the network lifetime.
DeepBAN [20] is a communication framework with channel prediction using a deep learning approach based on a temporal convolution network (TCN). To enhance the energy efficiency of the system, the authors proposed a joint algorithm including time slot allocation, power control, and relay node selection. In the previous study [21], a WBAN power control scheme based on reinforcement learning was proposed. The end-edge-cloud orchestrated network architecture is adopted. The transmission power level is modified according to the current channel condition to reduce energy consumption. In [22], a WBAN MAC protocol based on the priority ladders resource scheduling (PLRS) scheme was used to reduce the difference in the number of time slots among different sensor nodes with similar priorities. EEEA-MAC [23] is a hybrid CSMA/CA-TDMA MAC for WBANs. The authors of [24] extended the superframe structure in the IEEE 802.15.6 standard. In the dynamic superframe protocol, the priority-based dedicated time slots are assigned using the criteria importance through inter-criteria correlation (CRITIC) method.
In [25], the authors proposed a bargaining-based optimal slot sharing (BOSS) mechanism to solve the optimization problem of time slot assignment for sensor nodes. The communication among the sensing devices follows the IEEE 802.15.6 standard. A cooperative game theoretic method was designed to distribute the time slots to the sensors. It employs the Nash bargaining solution (NBS). In [26], the authors formulated the channel assignment problem as a graph coloring problem. To solve the problem, a game-theoretic approach was adopted. A distributed two-hop incomplete coloring (DTIC) mechanism was proposed. In this scheme, the two-hop information is used to reuse channels. If the number of colors is insufficient to color all the sensing nodes, the algorithm addresses the incomplete graph coloring problem. In addition, the authors proposed a distributed message-passing scheme to avoid collisions during data transmission as well as to share coloring information among sensor nodes.
In [27], the authors used cellular-assisted device-to-device (D2D) communications for WBANs. A radio resource allocation mechanism using the Hungarian algorithm and convex optimization was proposed. The algorithm aims to optimize the system throughput while mitigating collisions among WBANs and meeting the QoS requirements. In [28], the authors proposed a programmable MAC scheme that adds a command to the IEEE 802.15.6 standard beacon frame. At the start of every superframe, the coordinator broadcasts the beacon to the sensor nodes. In [29], the authors proposed a greedy algorithm to solve the time slot allocation problem in WBANs. The priority of the sensor node is based on the degree of importance, data rate, remaining energy, and overtime. In [30], the authors adopted the IEEE 802.15.4 standard for body sensor networks, in which user behaviors are detected autonomously. They proposed a cross-layer time-slotted channel hopping (TSCH) scheme.
In [31], the authors presented an optimal MAC (OMAC) protocol, which employs the multi-dimensional (MD) graph optimal mechanism. This protocol attempts to achieve a compromise between the QoS and the energy consumption of data transmission. Before data transmission, the nodes wait for a period of time for more data. The length of the waiting time is determined by the application. Aggregation is used to reduce the size of the data. In [32], the authors proposed a radio resource allocation method based on the marginal utility theory. The allocation problem was modeled as a sum-utility maximization problem to attain the throughput of the required network QoS, considering the efficiency and fairness among the sensor nodes. In [33], the authors proposed a numerical MAC optimization algorithm for WBANs based on the IEEE 802.15.4 standard beacon-enabled mode. The algorithm aims to satisfy the existing traffic and QoS requirements, as well as to optimize the network delay, reliability, and energy.
The aforementioned studies proposed various WBAN access control protocols. However, these methods focus on the time division multiple access (TDMA) mechanism, and there are few access protocols based on frequency division multiple access (FDMA). Consequently, there is room to improve the channel utilization. In addition, the superframe and time slot lengths in the existing protocols are mostly fixed. As the application scenarios of WBANs are heterogeneous, multi-source data have diverse requirements in terms of the QoS and quality of experience (QoE). The network performance of the MAC protocol must therefore be improved.

MEC-Based Architecture for eHealth Systems
The traditional healthcare system network is based on the cloud architecture. Figure 1 shows an example of the network architecture of cloud-based eHealth systems. In one WBAN, the coordinator (or personal devices such as smartphones or PDAs) collects the sensing data. It also provides functions such as data storage, processing, fusion, analysis, and display to the access gateway (AG). After collecting the data, the coordinator transfers the data to the AG or access point (AP). The AG or AP may link to the Internet. The collected data can be uploaded to the remote cloud server, where data are processed, visualized, exhibited, and/or stored. In some systems, there is no AG or AP. The physiological data are transmitted to the Internet by smartphones through mobile networks. The reference nodes (RNs) are equipped with GPS or preprogrammed with their locations; they are used for sensor node localization. The medical provider, such as a physician or nurse, can access the patients' data through the Internet and thus examine, monitor, and/or provide medical services to the patients. Collaboration or consultation by experts from different locations can be carried out through the Internet. The nearest ambulance can be sent out if an emergency occurs. Furthermore, the patients' health information can be stored in the cloud. Data and statistical analysis can be performed in the long term.
However, with the rapid development of eHealth applications, the aforementioned cloud-based network architecture faces some challenges. Owing to the massive increase in the number of users and the large amount of heterogeneous data from diverse application scenarios, data transmission and processing become a heavy burden on the network and remote cloud, which have limited communication and computing resources. System QoS degradation may often occur because of data congestion. Moreover, the network QoS requirements of eHealth systems based on wearable sensors, such as the time delay, delivery ratio, reliability, and fairness, differ significantly among various applications. Some vital data, such as the heartbeats of cardiac patients, must be transmitted with low latency and high reliability. Massive data generated by wearable sensors and uploaded to the cloud will lead to unpredictable delays. In the case of an emergency, the existing system may not give timely feedback owing to the return delay of the remote server. Therefore, how to improve both the efficiency of data transmission and the QoS has emerged as a major challenge for enhancing the application scope of eHealth systems based on wearable sensing devices.
In this study, we employ the MEC network architecture to solve the aforementioned problems. MEC technology is considered a promising paradigm for the next-generation Internet, in which the function of clouds will progressively move toward the edge of networks [34]. The edge layer close to the user can effectively reduce the transmission delay, improve the rapid response ability, and significantly decrease the volume of data sent to the cloud [35]. The MEC servers can respond to crucial requests promptly and then contact a nearby ambulance if required. Compared with the traditional cloud-based architecture, the MEC architecture has many advantages, such as shorter latency, lower energy consumption, and higher QoS. Figure 2 shows an example of the network architecture of edge-cloud eHealth systems. The system has four tiers: intra-WBAN communication, inter-WBAN communication, mobile edge communication, and remote cloud server.
The first tier is intra-WBAN communication. In this tier, body sensing devices are placed inside, on, or around the human body to collect physiological data, such as the body temperature, heartbeat, ECG, EEG, and EMG. Activity sensors are positioned on the human body to detect the posture and movement, such as lying, walking, sitting, and running. The sensors are organized by short-distance wireless communication networks, such as Wi-Fi, Bluetooth, and ZigBee. Data are sent to the coordinator or local device, such as a smartphone. The gateway transfers the information to the next tier.
The second tier is inter-WBAN communication. This tier is the access gateway, which bridges the gap between the coordinator and the MEC servers. The coordinators of multiple users send data to the access gateway (AG) through a wireless or mobile network. Essentially, communication at this tier aims to connect the WBAN with other systems or networks.
The third tier is mobile edge communication, which is composed of MEC servers and the access gateway (AG). The coordinator transmits the data and requirements to the MEC server and AG. Edge computing aims to offload tasks from mobile devices to a nearby data processing center, which can provide short-latency services. The MEC servers perform tasks in order of priority and carry out data fusion. Then, the MEC servers relay the data to the cloud system and database. On account of data fusion, the data volume from the sensors to the cloud is reduced significantly. The utilization efficiency of network communication and the computing capability can be increased by employing the edge computing architecture. Therefore, using the MEC architecture can improve the QoS and QoE of eHealth systems.
The fourth tier is the remote cloud server. The data of patients are stored in the database. The doctors or expert systems examine the patients and then provide medical services. Through the Internet, experts from different locations can conduct a collaboration or consultation. If an emergency occurs, such as a heart attack, the eHealth system will send an alarm as well as the required information to the nearest ambulance [36]. Furthermore, the patients' health information can be stored in the cloud. Data and statistical analysis can be conducted in the long term.
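The priority-ordered task handling described for the third tier can be illustrated with a small priority queue. The following Python sketch is only an illustration under stated assumptions: the class name `PriorityTaskQueue`, the task labels, and the convention that a larger number means a more urgent task are all hypothetical and are not specified by the architecture above.

```python
import heapq
import itertools


class PriorityTaskQueue:
    """Serve the highest-priority task first; ties are served in arrival order."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # arrival order, used to break ties

    def push(self, priority, task):
        # heapq is a min-heap, so the priority is negated to pop the largest first.
        heapq.heappush(self._heap, (-priority, next(self._counter), task))

    def pop(self):
        _, _, task = heapq.heappop(self._heap)
        return task

    def __len__(self):
        return len(self._heap)


queue = PriorityTaskQueue()
queue.push(2, "body temperature sample")  # hypothetical task labels
queue.push(7, "cardiac emergency alarm")
queue.push(5, "ECG segment")
queue.push(5, "EEG segment")
order = [queue.pop() for _ in range(len(queue))]
print(order)
```

In this sketch the emergency alarm is served first, the two equal-priority segments keep their arrival order, and the low-priority temperature sample is served last.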

Channel Allocation Based on MDP
In WBANs, the MAC protocol allocates the channel and time slot resources to multiple sensor nodes in the same individual area network. The MAC protocol attempts to not only avoid beacon and/or data frame collisions of the sensor nodes but also improve the transmission rate and throughput of the system. Therefore, the MAC protocol plays a critical role in improving the QoS and QoE of eHealth systems. In this section, we propose a MAC protocol based on an MDP for WBANs.

Problem Formulation Based on the Markov Decision Model
Considering the network communication of WBANs, the environment is modeled as an MDP. The process can be expressed as a five-tuple (S, A, P, γ, C), where S denotes a finite state set, A denotes a finite action set, P denotes the transition probability, γ denotes the discount factor, and C denotes the reward value obtained after the system performs action a in state s. We assume that at time t, the system is in state s. Then, $P_a(s, s')$ denotes the probability of the system being in state s′ at time t + 1 after taking action a. According to the Markov property, the conditional probability distribution of the future state depends only on the current state and the action taken in it, not on the earlier history.
The core idea of the MDP is to find the optimal strategy π*, where a strategy π is a mapping from the state set to the action set. To evaluate the quality of a strategy, it is also necessary to define a value function v, which is the expected value of the reward obtained by implementing the strategy. It is defined as follows:

$$v^{\pi}(s) = E\left[\sum_{t=0}^{\infty} \gamma^{t} c_{t} \,\middle|\, s_{0} = s\right]$$

where E is the mathematical expectation, γ ∈ [0, 1) is the discount factor, and $c_t$ is the immediate reward value generated at time t. The aforementioned formula is expressed recursively as follows:

$$v^{\pi}(s) = C(s, \pi(s)) + \gamma \sum_{s' \in S} P_{ss'}(\pi(s))\, v^{\pi}(s')$$

where C(s, a) = E[c(s, a)], which gives the average value of c, and $P_{ss'}(a)$ is the probability of transition from state s to the next state s′ under action a. The optimal strategy π* satisfies Bellman's criterion as follows:

$$v^{*}(s) = \max_{a \in A}\left[C(s, a) + \gamma \sum_{s' \in S} P_{ss'}(a)\, v^{*}(s')\right]$$

The Q function is defined on the basis of strategy π as follows:

$$Q^{\pi}(s, a) = C(s, a) + \gamma \sum_{s' \in S} P_{ss'}(a)\, v^{\pi}(s')$$

where $Q^{\pi}(s, a)$ is the expected reward value of state s after performing action a according to the strategy. Because $v^{*}(s) = \max_{a} Q^{*}(s, a)$, Q* can be expressed as follows:

$$Q^{*}(s, a) = C(s, a) + \gamma \sum_{s' \in S} P_{ss'}(a) \max_{a' \in A} Q^{*}(s', a')$$

Then, the optimal value Q*(s, a) is found recursively. The initial value of the Q table is 0. The Q-learning update rule is as follows:

$$Q_{t+1}(s, a) = (1 - \alpha)\, Q_{t}(s, a) + \alpha\left[c(s, a) + \gamma \max_{a' \in A} Q_{t}(s', a')\right] \tag{8}$$

where γ is the discount factor, which satisfies the condition 0 ≤ γ < 1, and α denotes the learning rate, which is calculated from m, the number of times the current state s has been visited.
If every (s, a) pair is visited infinitely often and the learning rate α decays to 0 appropriately, then the Q value converges to the optimal value with probability 1. Accordingly, the optimal channel resource allocation scheme can be obtained by iteratively updating the Q table according to this learning criterion.
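As a minimal sketch of a single Q-learning step of the kind described above, the following Python function blends the old table entry with a new target built from the immediate reward and the best value of the next state. The function name `q_update`, the dictionary layout of the table, and the numeric values are illustrative assumptions; in particular, the learning rate is passed in as a plain parameter rather than derived from the visit count m.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha, gamma):
    """Apply one tabular Q-learning step and return the new Q(state, action)."""
    # Best value reachable from the next state under the current estimates.
    best_next = max(q[(next_state, a)] for a in actions)
    # Blend the old estimate with the new target, weighted by alpha.
    q[(state, action)] = (1 - alpha) * q[(state, action)] + alpha * (
        reward + gamma * best_next
    )
    return q[(state, action)]


q = defaultdict(float)  # every entry starts at 0, as in the text
actions = [0, 1, 2]     # hypothetical channel indices
new_value = q_update(q, state=0, action=1, reward=1.0, next_state=1,
                     actions=actions, alpha=0.5, gamma=0.9)
print(new_value)  # 0.5
```

With an all-zero table, the first update simply moves the entry a fraction α of the way toward the immediate reward, which is why the printed value is 0.5.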

Action Selection Based on ε Greedy Strategy
The ε-greedy strategy is adopted to avoid falling into local optima during the learning process. An action must be selected in each state s. To maximize the accumulated reward, two aspects must be considered: estimating the reward value brought by each action, and selecting the action with the largest reward value. If the reward corresponding to each action a were a deterministic value C, the action with the highest reward value could be determined by performing every possible action once. However, the reward value C of an action is drawn from a probability distribution, so the average reward value cannot be accurately obtained by trying an action only once.
In each attempt, exploration is performed with probability ε, i.e., an action is randomly selected with uniform probability. With probability 1 − ε, the action corresponding to the maximum Q value is selected. In general, ε takes a small constant value, such as 0.1. If the number of attempts is sufficient, the reward value of each action can be effectively approximated after a period of time. In this way, unknown states are explored as much as possible while the optimal action, i.e., the optimal channel, is selected as often as possible.
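The selection rule described above can be sketched in a few lines of Python. The function name `epsilon_greedy`, the example Q values, and the fixed random seed are assumptions made for illustration only.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Return an action index: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        # Exploration: every action is equally likely.
        return rng.randrange(len(q_values))
    # Exploitation: pick the action with the largest Q value
    # (ties resolve to the lowest index).
    return max(range(len(q_values)), key=lambda a: q_values[a])


rng = random.Random(0)      # fixed seed so the run is repeatable
q_values = [0.2, 0.7, 0.1]  # hypothetical Q values for three channels
picks = [epsilon_greedy(q_values, 0.1, rng) for _ in range(1000)]
print(picks.count(1) / len(picks))  # close to 0.9 + 0.1 / 3
```

The greedy channel is chosen in roughly 93% of the attempts, because it is also one of the three equally likely outcomes of an exploration step.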

Reward Function
The reward function is denoted by c(s, a) in Equation (8). It is defined as the reward value of performing action a in state s at a certain time. More specifically, it is the reward when the current sensor node i selects the channel k, which is related to the service priority p and data frame length d of the WBAN node. According to the IEEE 802.15.6 standard, the traffic types corresponding to the eight priorities, listed from the lowest to the highest priority, are background, best effort, excellent effort, video, voice, medical data or network control, high-priority medical data or network control, and emergency or medical implant event reports [12]. The higher the priority, the shorter is the time delay that can be tolerated, and the higher is the reliability requirement.
The reward c(s, a) is computed under the following known conditions: there are N sensor nodes and R available frequency resources. The channel state of sensor node i is defined as an R-dimensional vector, i.e., $U(i) = \{u_1(i), u_2(i), \ldots, u_R(i)\}$, $i \in \{1, 2, \ldots, N\}$, where $u_k(i)$ describes the state of channel k for sensor node i.

Algorithm Flowchart
The objective of the algorithm is to obtain the optimal channel resource allocation mechanism to minimize the possibility of collisions when the sensor nodes transmit data. The flowchart of the MAC protocol based on the MDP is shown in Figure 3, which can also be described as follows:
Step 1: The system parameters are initialized. N denotes the number of sensor nodes, R denotes the number of available frequency resources of the system, T denotes the learning time, ε denotes the exploration probability of the ε-greedy strategy, γ denotes the discount factor, and α denotes the learning rate. The initial time t is 1.
Step 2: The system model parameters are initialized. According to the number of sensor nodes and frequency resources in the WBAN, the state space S and action space A are initialized. All the Q-function values in the table are set to 0, i.e., Q(s, a) = 0, where s ∈ S, a ∈ A. Furthermore, the channel state of sensor node i is defined as an R-dimensional vector.
Step 3: The system performs actions. A random number is generated in the current state s. If the random number is less than ε, or if all the corresponding Q values are 0, i.e., if the state is accessed for the first time, then the action is selected randomly. Otherwise, the action corresponding to the maximum value in the Q table is selected. Through this action a, a channel resource is allocated to the sensor node: a ∈ {1, 2, . . . , R}. The state at a certain time is determined by i and A(i), represented as (i, A(i)), where i denotes the current sensor node, i ∈ {1, 2, . . . , N}, and A(i) denotes the number of available channel resources of the current sensor node, A(i) ∈ {1, 2, . . . , R}. The discount factor γ satisfies 0 ≤ γ < 1.
Step 4: Record the reward value. The reward value c(s, a) appraises the effect of the action according to Equation (10). After taking action a, the reward function value c and the next state s′ are recorded.
Step 5: Update the Q table according to the reward value c(s, a) and the value Q by Equation (8), i.e., update the value of $Q_{t+1}(s, a)$ accordingly. When t < T, go to Step 3. When t ≥ T, stop the learning process and complete the channel allocation.
Step 6: The sensor node communicates with the coordinator through the allocated channel. It maintains a transmission waiting queue ordered by data priority.
As the link state varies with changes in the network topology and node movement, the probability of collision during data transmission may increase, which can lead to a decrease in the delivery ratio. If the delivery ratio falls below the threshold, the channel resources are reallocated.
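The steps above can be sketched end to end on a toy problem. The Python code below is a simplified illustration, not the evaluated protocol: the state is reduced to the index of the current sensor node, the nodes are visited in round-robin order, the reward is a hypothetical +1 for a collision-free channel and −1 otherwise (the actual reward of Equation (10) also depends on priority and frame length), and a constant learning rate is used for brevity. The function name `learn_channel_allocation` and the `clear` table are assumptions.

```python
import random


def learn_channel_allocation(clear, steps, epsilon, gamma, alpha, seed=0):
    """Learn one channel per node with tabular Q-learning.

    clear[i][k] is True when channel k is collision-free for node i
    (a hypothetical stand-in for the real channel state).
    """
    rng = random.Random(seed)
    n_nodes, n_channels = len(clear), len(clear[0])
    # Step 2: Q(s, a) = 0 for every node (state) and channel (action).
    q = [[0.0] * n_channels for _ in range(n_nodes)]
    node = 0
    for _ in range(steps):
        row = q[node]
        # Step 3: explore with probability epsilon or on a first visit,
        # otherwise take the channel with the largest Q value.
        if rng.random() < epsilon or all(v == 0.0 for v in row):
            channel = rng.randrange(n_channels)
        else:
            channel = max(range(n_channels), key=lambda k: row[k])
        # Step 4: record the reward and the next state (toy reward: +1 / -1).
        reward = 1.0 if clear[node][channel] else -1.0
        next_node = (node + 1) % n_nodes
        # Step 5: Q-learning update with a constant learning rate.
        target = reward + gamma * max(q[next_node])
        row[channel] = (1 - alpha) * row[channel] + alpha * target
        node = next_node
    # Step 6: each node transmits on its highest-valued channel.
    return [max(range(n_channels), key=lambda k: q[i][k]) for i in range(n_nodes)]


# Hypothetical scenario: 4 nodes, 3 channels, one clear channel per node.
clear = [
    [True, False, False],
    [False, True, False],
    [False, False, True],
    [True, False, False],
]
allocation = learn_channel_allocation(clear, steps=2000, epsilon=0.1,
                                      gamma=0.5, alpha=0.1)
print(allocation)
```

In this toy setting the Q value of a colliding channel can never become positive, while the Q value of a clear channel becomes positive as soon as it is tried once, so the learned allocation assigns each node its clear channel. The reallocation trigger based on the delivery ratio threshold is not modeled here.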

Performance Evaluation
This section describes the computer simulation and performance analysis of the proposed MDP-MAC protocol for WBANs introduced in the previous section. The proposed algorithm is compared with the in-body sensor medium access control (i-MAC) protocol [17], the energy-efficient dynamic channel allocation algorithm (EE-DCAA) [18], and the distributed two-hop incomplete coloring (DTIC) algorithm [26]. The network performance metrics are the system energy consumption, delivery ratio, and system throughput. The quality of experience (QoE) is also defined and evaluated.
The energy consumption E is defined as the average energy consumption by each sensor.
where E t denotes the system total energy consumption and T s (i) denotes the ith successful transmission.The delivery ratio D r is defined as the percentage of the transmitted data frames that are successfully received.
where T(i) denotes the ith transmission.The system throughput is calculated on the basis of the total successful transmitted packets and the total time T.
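The delivery ratio and throughput definitions can be illustrated with a short sketch. It assumes that transmissions are simply counted and that throughput is reported in bits per second using a fixed payload size; the function names and the `payload_bits` parameter are hypothetical and are not taken from the paper.

```python
def delivery_ratio(successful, attempted):
    """Percentage of transmitted data frames that were successfully received."""
    if attempted == 0:
        return 0.0                      # nothing was sent, so nothing delivered
    return 100.0 * successful / attempted


def throughput_bps(successful_packets, payload_bits, total_time_s):
    """Successfully delivered bits divided by the total time T."""
    return successful_packets * payload_bits / total_time_s


if __name__ == "__main__":
    print(delivery_ratio(successful=90, attempted=100))
    print(throughput_bps(successful_packets=1000, payload_bits=800,
                         total_time_s=10.0))
```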
Besides the QoS, the quality of experience (QoE) is also an important metric for evaluating network performance. QoE is the quality perceived subjectively by the users, which can be predicted and assessed from the QoS parameters [37]. In WBANs, the relationship between QoE and QoS is heterogeneous for different users. The QoE of a user is closely related to the data transmission demands and is influenced by the throughput. To evaluate the QoE precisely, it is defined following [38], where Q_i denotes user i's QoE, Thp_i is user i's throughput, Thp_i^max is the maximum throughput that user i needs, which reflects the heterogeneous transmission requirements of different users, and c_i is the sensitivity parameter of user i to the throughput. The system QoE, denoted by Q, is defined as the average Q_i of all the users.
In WBANs, the applications are extremely heterogeneous, and the traffic data rates vary considerably. Applications transmitting simple data require a rate of a few kilobits per second, whereas video streams require a rate of several megabits per second. The transmission data rate may be significantly higher in a particular time period, which is called a burst. Table 1 summarizes the data rate requirements for some applications [1]. They are computed from the expected accuracy, range, and sampling rate. Overall, the data rates of individual applications are not regarded as high. Nevertheless, if the user wears many body sensors, such as ECG, temperature, EEG, and motion sensors, the aggregate data rate can reach several megabits per second, which is higher than that of commonly used radios.
The sensor nodes of WBANs can be classified according to the priority of the data. There are four categories: non-constrained traffic class (NTC), delay traffic class (DTC), reliability traffic class (RTC), and critical traffic class (CTC). The NTC sensor nodes gather non-constrained data packets (NDP), which can tolerate a certain degree of loss and have a relaxed time delay requirement, such as blood pressure (BP) and temperature. The DTC sensor nodes collect delay data packets (DDP). These packets can tolerate some losses but have time delay constraints, such as telemedicine video imaging. The RTC sensor nodes collect reliability data packets (RDP). Such data have strict requirements in terms of the packet loss ratio but no time delay constraints, such as heart rate (HR) and respiratory rate (RR). The CTC sensor nodes collect critical data packets (CDP). These data packets have strict requirements in terms of both the maximum loss and the time delay, such as electroencephalogram (EEG) and electrocardiogram (ECG). Table 2 summarizes the four classes and priorities [2].
The path loss (PL) model of WBANs is described by the following parameters: a denotes the gradient coefficient, which is experimentally measured as 1.92 dB/cm; d denotes the distance; b is a constant, which has a value of 39.85 dB in real systems; and N denotes the stochastic fluctuation following a normal distribution N(0, σ_N), where the experimentally estimated value of σ_N is 6.59 dB. Further, P(θ) denotes the fluctuation induced by the direction of the sensor. In its definition, θ denotes the angle between the transmitting and receiving antennas and x_c denotes the electric field direction difference between main and cross, which has a value of 0.145 in real experiments.
We simulate the performance of the proposed protocol in different scenarios with various node densities and environments. In the experiments, the sensor nodes and coordinators of the WBAN are deployed randomly in a square area of 10 m × 10 m. The number of sensor nodes varies from 10 to 100. The star network topology is adopted, in which the coordinator is connected to multiple wearable sensor nodes. The WBAN coordinators remain static for 10 s after the simulation starts. Then, the coordinators begin to move randomly at a speed in the range [0, 1.5] m/s. The mobility velocity of the sensing devices is a random variable that is uniformly distributed in [0, 0.5] m/s, which emulates the walking speed of humans. The simulation is run 20 times, each run is executed independently, and the reported values are the averages over these runs.
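The path-loss parameters given earlier in this section can be combined in a small numerical sketch. It rests on assumptions that are not confirmed by the text: the terms are taken to add in dB as a·d + b + N + P(θ), the distance d is taken in centimetres because a is given in dB/cm, σ_N is treated as a standard deviation, and the orientation term P(θ) is supplied by the caller because its formula is not reproduced here. The function and constant names are hypothetical.

```python
import random

A_DB_PER_CM = 1.92     # gradient coefficient a
B_DB = 39.85           # constant b
SIGMA_N_DB = 6.59      # sigma_N of the fluctuation N (assumed std. deviation)


def path_loss_db(distance_cm, orientation_db=0.0, rng=None):
    """Assumed additive model: PL = a*d + b + N + P(theta), all in dB.

    orientation_db stands in for P(theta); pass rng=None to drop the
    random fluctuation N and obtain the deterministic part only.
    """
    fluctuation = rng.gauss(0.0, SIGMA_N_DB) if rng is not None else 0.0
    return A_DB_PER_CM * distance_cm + B_DB + fluctuation + orientation_db


if __name__ == "__main__":
    print(path_loss_db(10.0))                        # deterministic part
    print(path_loss_db(10.0, rng=random.Random(1)))  # with fluctuation N
```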
In one cycle, the first data frame is generated randomly. The application traffic is assumed to be constant bit rate (CBR). The packet generation rate (PGR) is set to 1-40 packets/s for different sensors. The transport layer protocol is TCP. The data transmission rate (DTR) is set to 2-300 kbps. The maximal transmission requirement of a user is set to 10-2000 kbps. The sensitivity parameter of a user to the throughput varies between 1 and 5. The buffer size (BS) is 1-4 Kbytes. The superframe length is assumed to be 64 slots. The duration of a time slot is set to 5 ms. The sizes of the MAC header, beacon, ACK, and FCS are 27, 17, 7, and 2 bytes, respectively. The payload length of the data frame is 30-200 bytes. The channel rate is 250 kbps. The frequency band is in the range of 2400-2483.5 MHz. The interference range of the sensors is assumed to be 2 m. The initial energy of the nodes is 100 J. The voltage supply is 3 V. Table 3 summarizes the energy consumption parameters employed in the simulation.
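The timing implied by these parameters can be checked with simple arithmetic. The sketch below assumes that a data frame consists only of the MAC header, the payload, and the FCS, and it ignores physical-layer overhead, ACKs, and inter-frame spacing; these simplifications and the function names are assumptions made for illustration.

```python
import math

SLOTS_PER_SUPERFRAME = 64
SLOT_DURATION_S = 0.005        # 5 ms
CHANNEL_RATE_BPS = 250_000     # 250 kbps
MAC_HEADER_BYTES = 27
FCS_BYTES = 2


def superframe_duration_s():
    """Length of one superframe in seconds."""
    return SLOTS_PER_SUPERFRAME * SLOT_DURATION_S


def frame_airtime_s(payload_bytes):
    """Time on air of one data frame (header + payload + FCS only)."""
    total_bits = 8 * (MAC_HEADER_BYTES + payload_bytes + FCS_BYTES)
    return total_bits / CHANNEL_RATE_BPS


def slots_needed(payload_bytes):
    """Number of whole time slots covered by one data frame."""
    return math.ceil(frame_airtime_s(payload_bytes) / SLOT_DURATION_S)


if __name__ == "__main__":
    print(superframe_duration_s())
    for payload in (30, 200):
        print(payload, frame_airtime_s(payload), slots_needed(payload))
```

Under these assumptions a superframe lasts 0.32 s, a frame with a 30-byte payload takes about 1.9 ms, and a frame with a 200-byte payload takes about 7.3 ms, which is longer than a single 5 ms slot.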
Figure 4 shows the simulated system energy consumption of the proposed and existing protocols. The overall energy consumption increases continuously with the number of nodes: when the number of sensor nodes increases, the probability of allocating an idle channel decreases. The plot also shows that the average energy consumption of each node under the proposed MDP-based MAC protocol is less than that under the existing MAC protocols. The energy consumption of the DTIC mechanism is lower than that of i-MAC and EE-DCAA.
For WBAN protocols, owing to the power limitation, energy efficiency is one of the most important metrics. The lifetime of WBANs can be extended by reducing the power consumption of the sensor nodes. The figure shows that by adopting the proposed MDP-based MAC protocol, the system energy consumption decreases substantially. The edge-cloud network architecture reduces the data delay, thereby reducing the energy consumption. Moreover, the MDP-based MAC protocol minimizes the energy consumption while maintaining the link quality.

In the MDP-based MAC protocol, the channel utilization ratio can be decreased, which reduces not only the energy consumption of the system but also the probability of beacon and data collision. If a collision occurs, retransmission is required, which causes considerable power wastage. Therefore, the overlap area of the transmission range can be reduced. The proposed scheme can regulate the power consumption and extend the WBAN lifetime.
Figure 5 compares the WBAN delivery ratio of the proposed MAC protocol with that of the other three protocols. The delivery ratio is another important metric of network performance. As the number of users increases, the delivery ratio of the WBAN gradually decreases. The figure shows that the delivery ratio of the MDP-based MAC protocol is higher than that of i-MAC, EE-DCAA, and DTIC. The proposed protocol improves the delivery ratio because it reduces beacon and data frame collisions. This scheme also enhances the link reliability: because of the lower collision rate, the successful transmission rate increases.
Figure 6 shows the network throughput of the WBAN. The system throughput reflects the WBAN data transmission capability, which is another critical metric. As the number of nodes increases, the throughput of the eHealth Internet of Things (IoT) system also increases. A further increase in the number of sensor nodes may affect the overall network throughput, as the probability of channel allocation decreases and the collision probability increases. It is evident that the system throughput of the MDP-based MAC protocol is higher than that of i-MAC, EE-DCAA, and DTIC. The proposed protocol can reduce the beacon or data frame collisions, thereby decreasing the retransmission time. The system throughput increases as there are fewer collisions and the latency is reduced. As the retransmission rate is lower, the data delay is shorter. The channel utilization is improved, and the average time delay is reduced.

Figure 7 shows the simulation results for the QoE of the WBAN. The QoE reflects the average level to which the users' data transmission demands are met. As the number of nodes increases, the throughput of a user decreases, which causes a gradual decline in the QoE. It is evident that the QoE achieved using the MDP-based MAC protocol is higher than that achieved using i-MAC, EE-DCAA, and DTIC. In the proposed protocol, with the optimal strategy, the channel resources are allocated to minimize the probability of collision when the sensor nodes transmit data. This reduces the time wasted on retransmission.

Conclusions
WBANs support the development of IoT eHealth systems. Because of the limited battery capacity, the energy efficiency of sensing devices is a critical issue. Moreover, owing to the mobility of sensor nodes, the rapidly changing link state, and human body shadowing, there is a trade-off between communication reliability and energy consumption. To achieve a shorter task processing delay, this study applied an MEC-based network architecture to an eHealth system. Further, an MDP-based MAC protocol was proposed. The energy utility model and optimization problem were solved using a learning algorithm. The simulation-based performance evaluation showed that the proposed protocol achieves a higher system energy efficiency, delivery ratio, throughput, and QoE than the compared protocols. In future work, we will perform experiments to validate the performance of the proposed protocol.


Figure 1. Network architecture of cloud-based eHealth systems.

Figure 2. Network architecture of edge-cloud eHealth systems.
The first tier is intra-WBAN communication. In this tier, body sensing devices are placed inside, on, or around the human body to collect physiological data, such as the body temperature, heartbeat, ECG, EEG, and EMG. Activity sensors are positioned on the human body to detect the posture and movement, such as lying, walking, sitting, and running. The sensors are organized by short-distance wireless communication networks,

Figure 3. Flowchart of the MDP-based MAC protocol.

Figure 4. Simulation results of energy consumption.

Figure 5. Simulation results of delivery ratio.

Table 1. Data rate requirement of healthcare applications.

Table 2. Sensor classes and priorities.

Table 3. Parameters of energy consumption.