Journal of Sensor and Actuator Networks
Wireless Industrial Monitoring and Control Networks: the Journey so Far and the Road Ahead

While traditional wired communication technologies have played a crucial role in industrial monitoring and control networks over the past few decades, they are increasingly proving to be inadequate to meet the highly dynamic and stringent demands of today's industrial applications, primarily due to the very rigid nature of wired infrastructures. Wireless technology, however, through its increased pervasiveness, has the potential to revolutionize the industry, not only by mitigating the problems faced by wired solutions, but also by introducing a completely new class of applications. While present-day wireless technologies have made some preliminary inroads in the monitoring domain, they still have severe limitations, especially where real-time, reliable distributed control operations are concerned. This article provides the reader with an overview of existing wireless technologies commonly used in the monitoring and control industry. It highlights the pros and cons of each technology and assesses the degree to which each technology is able to meet the stringent demands of industrial monitoring and control networks. Additionally, it summarizes mechanisms proposed by academia, especially those serving critical applications by addressing the real-time and reliability requirements of industrial process automation. The article also describes certain key research problems, from the perspectives of physical layer communication for sensor networks and wireless networking, that have yet to be addressed to allow the successful use of wireless technologies in industrial monitoring and control networks.


Introduction
Present-day large-scale industrial monitoring and control systems may typically consist of thousands of sensors, controllers and actuators. In order to carry out their assigned tasks, it is essential for the devices to communicate. In the past, this communication was performed over point-to-point wired systems. Such systems, however, involved a huge amount of wiring which in turn introduced a large number of physical points of failure, such as connectors and wire harnesses, resulting in a highly unreliable system. These drawbacks resulted in the replacement of point-to-point systems by industrial computer networks known as fieldbuses. Over the past few decades, the industry has developed a myriad of fieldbus protocols (e.g., Foundation Fieldbus H1, ControlNet, PROFIBUS, CAN, etc.). Compared to traditional point-to-point systems, fieldbuses allow higher reliability and visibility and also enable capabilities such as distributed control, diagnostics, safety, and device interoperability [1].
However, industrial processes are rapidly increasing in complexity in terms of factors such as scale, quality, inter-dependencies, and time and cost constraints. For example, globalization has led to companies opening up their manufacturing plants in not just one, but multiple geographic locations. Yet, in order to maximize the utilization of these distributed resources and optimize global operation, it is essential for companies to have a detailed outlook of the various operational characteristics of every single piece of equipment within every industrial plant. This could possibly require both static and moving parts of a piece of machinery to be monitored. In other words, accurate, fine-grained, large-scale, remote monitoring is an essential requirement [2].
Similarly, the view of increasing complexity also holds when considering applications which go beyond monitoring and also require control. Control operations have traditionally been carried out at the point of sensing, but more complex applications now require distributed sensing and control. For example, in order to optimize overall energy usage, an industrial plant might require several pieces of machinery located in different parts of the plant to change their operational characteristics. This would require distributed sensing, control and subsequently actuation.
While existing industrial networking technologies are sufficient for performing localized monitoring and control, the distributed nature of upcoming industrial applications requires a paradigm shift from present-day strategies. The focus needs to shift from localized operations to a distributed approach where new benefits and synergies are discovered from the interconnection and communication of individual systems.
Wireless technologies have the potential to play a key role in industrial monitoring and control systems, as they have certain key advantages over conventional wired networks. In addition to extensively reducing bulk and installation costs, the unobtrusiveness of the technology allows it to be deployed easily in areas which simply cannot be monitored using wired solutions (e.g., in moving parts) [3]. Modifications of the network topology (in terms of the addition or reorganization of nodes) can also be easily performed without incurring additional costs for wiring. With increased scalability, wireless sensor networks can also run collaborative algorithms (e.g., for vibration monitoring applications) to improve the robustness of the overall system. Wireless systems also require less maintenance, since unlike their wired counterparts, they are not prone to damage due to corrosion or wear and tear. Thus, this unique combination of increased scalability and robustness through using distributed mechanisms makes wireless technologies an invaluable option for developing future industrial applications that require fine-grained, flexible, robust, low-cost and low-maintenance monitoring and control.
However, wireless strategies also introduce a set of problems that can detrimentally affect various performance metrics (e.g., reliability and real-time capability). In Section 2, this article provides the reader with an overview of existing wireless technologies commonly used in the monitoring and control industry. Section 3 highlights the pros and cons of each technology and assesses the degree to which each technology is able to meet the stringent demands of industrial monitoring and control networks. Section 4 presents mechanisms proposed by academia for addressing the real-time and reliability requirements of industrial process automation. Section 5 presents mechanisms used by industrial technologies for addressing the requirements of industrial automation wireless networks in terms of real-time capability and reliability. Section 6 goes on to describe different aspects of the physical layer that is utilized in several wireless industrial technologies. Section 7 describes key research problems from the wireless networking perspective that have yet to be addressed to allow wireless technologies to be successfully used in industrial monitoring and control applications. Finally, Section 8 concludes the paper.

Overview of Existing Wireless Standards and Protocols
This section presents an overview of the wireless technologies that have been specifically tailored for use in industrial automation. They can be categorized into two groups: standards based on IEEE 802.15.1 and standards based on IEEE 802.15.4 [4].
Wireless Interface for Sensor and Actuators (WISA) [5] is a protocol based on the IEEE 802.15.1 standard. It has been developed by ABB and allows wireless communication between sensors and actuators. It is specifically designed to address the stringent real-time requirements of factory automation.
The WirelessHART protocol, developed by the HART Communication Foundation, uses a time-synchronized, self-organizing and self-healing mesh architecture. WirelessHART is backward compatible with the HART (Highway Addressable Remote Transducer) protocol, which is a global standard for sending and receiving digital information over analog wires between monitoring and control systems.
WIA-PA is a system architecture and communication protocol for wireless networks that was first developed by the Chinese Industrial Wireless Alliance (CIWA). ISA100.11a has been developed by the ISA100 standard committee, which is a part of the International Society of Automation (ISA). ISA100.11a uses the IPv6 over Low power WPAN (6LoWPAN) protocol in the network layer. 6LoWPAN was originally targeted at IEEE 802.15.4 radio standards assuming layer-2 mesh forwarding capability. Using the 6LoWPAN protocol in the network layer in ISA100.11a allows IP-based communication over IEEE 802.15.4. ISA100.11a uses a synchronized mesh protocol (based on TSMP) in the data link layer which allows peer-to-peer communication and mesh forwarding. This makes every node in the sensor network directly accessible through the Internet. WISA, WirelessHART, WIA-PA and ZigBee Pro do not have the capability to provide such access.

Critical Metrics for Industrial Monitoring and Control
This section first evaluates the existing wireless technologies based on certain metrics that are essential for large-scale industrial monitoring and control applications, such as real-time capability, scalability, power consumption and robustness.

Real Time Capability
Based on the criticality and importance of the applications, the International Society of Automation (ISA) considers six classes of wireless communication, from critical control to monitoring applications, in which the importance of the message response time and Quality of Service (QoS) requirements varies [12]. In the more critical applications, process values need to be transmitted to the destination in a reliable, timely and accurate manner. The details of the classes are shown in Table 2. While ISA100.11a supports industrial applications from class 1 to 5, WirelessHART supports industrial applications ranging from class 2 to 5 [12]. ZigBee Pro is designed for applications which have softer real-time requirements [13]. Traditional wireless sensor networks (WSNs) are deployed in class 4-5 applications [12], where low power consumption is given priority over providing a bounded response time delay. Such WSNs are not suitable for controlling tight control loops as nodes usually spend a large proportion of the time in a low-power sleep state.
WISA is the only wireless protocol that is suitable for factory automation applications as it can provide some strict real-time guarantees. Such applications impose basic requirements on the wireless link, for example, low additional latency due to the wireless link (e.g., <10 ms).
We carry out a more detailed analysis of the real-time capabilities of ZigBee Pro, WirelessHART, WIA-PA, WISA and ISA100.11a later in the paper by discussing specific details relating to the MAC layer contention mechanism and priority management schemes.

Scalability
As industrial processes increase in complexity, the number of points that need to be monitored and controlled increases rapidly. This makes it essential to design network architectures which are capable of scaling up. In other words, the objective is to ensure optimal network performance even when the network size or rate of data generation increases.
Current wireless technologies designed specifically for industrial applications such as WirelessHART, WIA-PA (in centralized management scheme) and ISA100.11a mostly use a centralized approach for managing resources. While centralized approaches are technically easier to develop and manage, they are unable to cope with sudden changes that might occur frequently in a harsh industrial environment. This problem is further exacerbated as the network is scaled up. For example, a motor capable of running at different speeds may cause radio interference at different frequencies as it changes its operational speed. Wireless nodes operating in the vicinity of the motor should ideally reorganize their communication protocols using distributed techniques as and when interference is detected to quickly adapt to the changing environment. Traditional centralized approaches are unable to cope with such sudden unexpected changes, as they would then require detailed network statistics to be sent back to the central system manager which would then clog up the limited network resources. Thus, the larger the scale of the deployment, the more important it is to utilize distributed approaches to ensure that the system continues to perform optimally.

Power Consumption
Unlike in traditional wireless sensor networks, power consumption in industrial sensor networks has a lower priority than other performance metrics, such as reliability and real-time capability. However, the degree of importance of power consumption varies greatly depending on the class of application. Industrial control applications can be categorized into two main classes: (i) process control, and (ii) factory automation.
Process control is typically used for monitoring fluids (e.g., oil level in a tank, pressure of a gas, etc.). Such applications, which are typically non-critical and require closed-loop control, usually transmit process values at regular intervals. Furthermore, due to the non-critical nature of process control applications, latency requirements are not usually stringent (>100 ms). This allows nodes to reduce power consumption by carrying out aggressive duty cycling of their radios and sensor sampling operations. Factory automation applications, however, involve machines (e.g., robots) that perform discrete actions and are highly sensitive to message delays. Thus, such applications generate 'bursty' data and may require latency in the region of 2-50 ms. In such instances, reducing power consumption has a lower priority than other performance metrics such as real-time capability and reliability.
In terms of energy consumption, ZigBee Pro and WIA-PA (in the cluster/star level) do not perform as well as the other competing technologies, as they carry out time synchronization using the beacon-enabled mode of IEEE 802.15.4. Using beacons introduces a large overhead in terms of higher energy consumption, as the radio needs to remain in listen mode for long periods. Conversely, TSMP solves the time synchronization problem in a more energy-efficient manner by only relying on ACKs to exchange timing offset information. WirelessHART, WIA-PA (in the mesh level) and ISA100.11a also benefit from this approach as they utilize TSMP. Furthermore, the ISA100.11a specification allows the transmission power of individual nodes to be controlled. This can result in additional energy savings. However, the specification does not describe any algorithms indicating the strategies to be followed to carry out adaptive power control.

Reliability
Reliability is an integral part of any industrial monitoring and control system as any slight degradation in communication can potentially result in complete system malfunction. In order to ensure reliable wireless communication, various techniques can be used to mitigate communication problems such as interference and weak signals. Figure 1 gives an overview of the different classes of problems in wireless communication commonly present in industrial environments and their relevant solutions. We present some of the more important solutions developed both in academia and industry in greater detail in the following sections. The PHY aspects of reliability are addressed in Section 6.

Figure 1. Common wireless communication problems and relevant solutions in typical industrial environments. Wireless problems shown in the figure: internal interference (such as collisions or hidden terminal problems); external interference (such as WiFi networks and high-power interference sources); self-interference (multipath fading); signal attenuation (e.g., due to physical obstructions).

Techniques Used in Academia to Improve Performance Metrics
It is important to reiterate that reliability and real-time capability have higher priorities than other performance metrics in industrial sensor networks. In this section, we highlight some of the mechanisms introduced in academia to address the reliability and real-time requirements of WSN applications.

Mechanisms to Improve Reliability
Reliability is an integral part of any industrial monitoring and control system as any slight degradation in communication can potentially result in complete system malfunction. The industrial environment may suffer from noise, internal interference, multipath propagation, external interference, collisions, and physical obstructions. In order to ensure reliable wireless communication, various techniques have been introduced to mitigate communication problems such as interference and weak signals. We discuss the mechanisms presented in academia for addressing the reliability requirement in an industrial automation environment, such as diversity, error control schemes, multipath routing, and network coding.

Diversity
When the wireless channel is in a deep fade, any communication scheme will suffer from errors. In the case of narrowband fading, by passing the information symbols through multiple signal paths, multiple independently faded replicas of the data symbols are received at the receiver. This is a potential way to improve the performance of communication over fading channels, provided at least one of the signal paths is strong [14]. This mechanism is called diversity. The common diversity methods that are widely utilized in wireless communication systems are spatial diversity (or antenna diversity), cooperative diversity, temporal diversity, and frequency diversity.
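The benefit of diversity can be illustrated with a small calculation. The sketch below assumes, purely for illustration, that each signal path is in a deep fade independently and with the same probability; the function name and the numbers are hypothetical and do not come from [14].

```python
def all_paths_faded(p_fade: float, n_paths: int) -> float:
    """Probability that every one of n_paths independent paths is in a deep fade.

    Assumption (not from the source): paths fade independently, each with
    the same probability p_fade.
    """
    if not 0.0 <= p_fade <= 1.0:
        raise ValueError("p_fade must be a probability")
    if n_paths < 1:
        raise ValueError("at least one path is required")
    return p_fade ** n_paths


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, all_paths_faded(0.1, n))
```

Under these assumptions, communication fails only when all replicas are faded at once, so the failure probability shrinks geometrically with the number of independent paths.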

Spatial Diversity and Cooperative Diversity
Antenna diversity can be achieved by equipping the transmitter and/or receiver with multiple antennas. If the antennas are sufficiently far apart from each other, and the local scattering environment is rich enough, then each antenna will see independent fading. References [15] and [16] use spatial diversity to improve the reliability of the link. In [17] the authors study antenna switching in a WSN setting in a laboratory. They analyze their results for two types of data links, i.e., with and without a feedback channel. In the former scenario, the transmitter switches the antenna if it does not receive any acknowledgment, and in the latter scenario, the transmitter sequentially sends multiple copies of the same packet, each on a different antenna. References [18] and [19] explore the use of directional antennas in WSNs and show their potential to provide increased throughput and reduced latency.
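The two antenna-switching scenarios described for [17] can be sketched as follows. This is a minimal illustration, not the implementation used in [17]: the `transmit` callback, the antenna labels and the attempt limit are hypothetical stand-ins for a real radio driver.

```python
from typing import Callable, Optional, Sequence


def send_with_feedback(packet: bytes,
                       antennas: Sequence[str],
                       transmit: Callable[[str, bytes], bool],
                       max_attempts: int) -> Optional[str]:
    """Feedback channel available: switch antenna whenever no ACK arrives.

    `transmit` is a hypothetical callback returning True when an ACK is
    received. Returns the antenna that succeeded, or None on failure.
    """
    for attempt in range(max_attempts):
        antenna = antennas[attempt % len(antennas)]
        if transmit(antenna, packet):
            return antenna
    return None


def send_without_feedback(packet: bytes,
                          antennas: Sequence[str],
                          transmit: Callable[[str, bytes], bool]) -> int:
    """No feedback channel: send one copy of the packet on every antenna."""
    for antenna in antennas:
        transmit(antenna, packet)
    return len(antennas)


if __name__ == "__main__":
    # Toy link in which only antenna "B" currently sees a good channel.
    link = lambda antenna, packet: antenna == "B"
    print(send_with_feedback(b"value", ["A", "B"], link, max_attempts=4))
```

The feedback variant spends extra transmissions only when needed, whereas the blind variant always pays for one copy per antenna.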
However, deploying several antennas at each wireless sensor node is difficult due to constraints in terms of analog device power consumption, space limitation and simplicity in implementation. Collaborative (or cooperative) diversity, proposed in [20,21], creates a virtual antenna array by enabling single-antenna network nodes to share their antennas through relaying. In this approach, nodes help each other to relay information in order to realize spatial diversity advantages. Significant performance gains in terms of link reliability and diversity gain can be achieved by utilizing collaborative diversity. In [22][23][24][25], the authors study cooperative communications in resource-constrained wireless networks and wireless sensor networks.
WSNs require a distributed cooperative protocol, in which users independently decide with whom to cooperate at any given time. In distributed cooperative diversity, the challenge is to establish a scheme that treats users fairly, does not incur too much overhead, and is compatible with the system's multiple access protocols [22]. Laneman et al. explore a distributed partner assignment algorithm by applying the cooperative protocol when the channel has a high signal-to-noise ratio (SNR). Otherwise the network reverts to the non-cooperative mode [20].
Receiver diversity in the form of route choices at each hop can also be considered a type of spatial diversity in the case of a multipath routing scheme, which is discussed in Section 4.1.3.

Temporal Diversity
Time diversity can be achieved via (1) channel coding in conjunction with time interleaving or (2) simply by transmitting the packet several times. In the former scheme, the information is coded first and then dispersed over different coherence periods in time [14]. In the latter scheme, the messages are transmitted over different time slots with time separation.
In channel coding with time interleaving, the interleaving span should be larger than the coherence time of the channel so that different parts of the code words experience independent fading. The interleaving fails if the fading is very slow.
In the retransmission scheme, if the time between consecutive transmissions is larger than the coherence time, then each transmission will see independent fading. However, if there is an end-to-end delay constraint or a large channel coherence time, the retransmission scheme fails to work.
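The tension between coherence time and delay constraints can be made concrete with a short calculation. The sketch below is only an illustration under stated assumptions: attempts are spaced exactly one coherence time apart, the first attempt starts at time zero, and all parameter names and values are hypothetical.

```python
def independent_attempts(deadline_us: int, tx_time_us: int,
                         coherence_time_us: int) -> int:
    """Number of attempts that see independent fading before the deadline.

    Assumptions (not from the source): attempt i starts at
    i * coherence_time_us and must finish by deadline_us.
    Integer microseconds are used to avoid rounding surprises.
    """
    if tx_time_us <= 0 or coherence_time_us <= 0:
        raise ValueError("durations must be positive")
    if deadline_us < tx_time_us:
        return 0
    return (deadline_us - tx_time_us) // coherence_time_us + 1


if __name__ == "__main__":
    # 50 ms deadline, 4 ms packet, 20 ms coherence time.
    print(independent_attempts(50_000, 4_000, 20_000))
    # Same deadline with very slow fading (100 ms coherence time).
    print(independent_attempts(50_000, 4_000, 100_000))
```

When the coherence time exceeds the delay budget, only a single attempt fits, so retransmission provides no time diversity, which matches the failure case described above.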
In [17], the author studies time diversity in conjunction with antenna diversity, based on CC2431 radio modules.

Frequency Diversity
Diversity can also be exploited over frequency in the case of a frequency-selective channel. Transmitting the same information on two frequency channels that are separated by more than the coherence bandwidth (i.e., the frequency shift required for a link to transition out of a deep fade) leads to independent fading.
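As a simple illustration of this separation rule, the sketch below lists the IEEE 802.15.4 channels in the 2.4 GHz band (channels 11 to 26, spaced 5 MHz apart starting at 2405 MHz) whose centre frequency differs from a reference channel by more than a given coherence bandwidth. The coherence bandwidth values used are hypothetical examples, not measurements.

```python
from typing import List


def channel_center_mhz(channel: int) -> int:
    """Centre frequency of an IEEE 802.15.4 2.4 GHz channel (11..26)."""
    if not 11 <= channel <= 26:
        raise ValueError("2.4 GHz IEEE 802.15.4 channels are 11..26")
    return 2405 + 5 * (channel - 11)


def independent_channels(reference: int, coherence_bw_mhz: float) -> List[int]:
    """Channels separated from `reference` by more than the coherence bandwidth."""
    f_ref = channel_center_mhz(reference)
    return [ch for ch in range(11, 27)
            if abs(channel_center_mhz(ch) - f_ref) > coherence_bw_mhz]


if __name__ == "__main__":
    # Hypothetical coherence bandwidth of 12 MHz.
    print(independent_channels(11, 12.0))
```

A retransmission or duplicate sent on any of the returned channels would, under this simplified model, experience fading that is independent of the reference channel.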
In the physical layer, the frequency hopping spread spectrum (FHSS) mechanism is an example of frequency/spectral diversity, in which multiple channels are used to communicate. Modulation techniques, such as the direct-sequence spread spectrum (DSSS), minimize the effect of noise on a given channel.
In the MAC layer, frequency hopping is a diversity technique in which each node-to-node transmission occurs on a different frequency than the previous one. The frequency-hopping scheme is studied in [26][27][28][29][30][31]. In [26], the authors show that multipath effects are handled by channel hopping, while Reference [30] argues that adaptive routing is sufficient to mitigate multipath fading. Furthermore, Reference [31] evaluates the impact of channel hopping and adaptive routing on delay and reliability.

Error Control Mechanisms
The three main error control schemes employed by WSNs to combat the unreliability of the wireless channel are Forward Error Correction (FEC), Automatic Repeat Request (ARQ) and Hybrid ARQ (HARQ) [32].
FEC is accomplished by adding redundancy to the transmitted information so that the receiver can retrieve the information if a limited number of bits are corrupted. FEC schemes improve error resiliency at the cost of communication overhead, since more information is exchanged, as well as the energy needed to decode packets. Decoding energy is one of the main concerns when applying the FEC scheme in WSNs, where low clock-rate CPUs are used.
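To make the idea of redundancy-based correction concrete, the following sketch uses the classic Hamming(7,4) code, which adds three parity bits to four data bits and corrects any single bit error. The choice of code is an assumption made for illustration only; the works surveyed here consider other codes (e.g., Reed-Solomon codes in [33]).

```python
from typing import List


def hamming74_encode(data: List[int]) -> List[int]:
    """Encode 4 data bits into a 7-bit codeword [p1, p2, d1, p4, d2, d3, d4]."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers codeword positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(codeword: List[int]) -> List[int]:
    """Correct at most one flipped bit and return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    error_position = s1 + 2 * s2 + 4 * s4  # 0 means no error detected
    if error_position:
        c[error_position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


if __name__ == "__main__":
    sent = hamming74_encode([1, 0, 1, 1])
    corrupted = list(sent)
    corrupted[4] ^= 1  # single bit error on the channel
    print(hamming74_decode(corrupted))
```

The example also shows the overhead side of the trade-off: seven bits are transmitted for every four data bits, and the receiver must compute the syndrome for every block.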
ARQ schemes depend on the retransmission of failed packets. In case of errors, ARQ incurs significant communication overhead as well as additional latency. ARQ is, therefore, infeasible for applications with real-time requirements. However, in the case of good channel quality, ARQ overhead is low compared to FEC. ARQ can be considered as a temporal diversity technique.
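The dependence of ARQ overhead on channel quality can be sketched numerically. The model below is a simplification under stated assumptions: packet losses are independent with a fixed packet error rate, acknowledgments are never lost, and the sender gives up after a fixed number of attempts. The function name is hypothetical.

```python
from typing import Tuple


def arq_stats(packet_error_rate: float, max_attempts: int) -> Tuple[float, float]:
    """Return (delivery probability, expected number of transmissions).

    Assumptions (not from the source): independent losses, perfect ACKs,
    and at most max_attempts transmissions per packet.
    """
    if not 0.0 <= packet_error_rate <= 1.0:
        raise ValueError("packet_error_rate must be a probability")
    if max_attempts < 1:
        raise ValueError("at least one attempt is required")
    p = packet_error_rate
    delivery = 1.0 - p ** max_attempts
    # Attempt i + 1 is only sent if the first i attempts all failed.
    expected_tx = sum(p ** i for i in range(max_attempts))
    return delivery, expected_tx


if __name__ == "__main__":
    print(arq_stats(0.01, 3))  # good channel: close to one transmission
    print(arq_stats(0.5, 3))   # poor channel: noticeable overhead
```

On a good channel the expected number of transmissions stays close to one, while on a poor channel both the overhead and the worst-case latency grow with the retry limit.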
Hybrid ARQ schemes combine the advantages of both FEC and ARQ by incrementally using more powerful FEC codes in the transmitted packets when the packet is received in error.
In [32], it is shown that the FEC schemes can be exploited to improve network error resiliency by reducing the transmit power, which improves the network capacity by reducing the communication interference, or by constructing longer hops compared to ARQ by using the same transmit power. Reference [32] reveals that an increase in the hop-length can decrease the end-to-end delay. FEC is, therefore, more suitable to delay sensitive applications in WSNs when compared to ARQ. Similarly, in [33], a cross-layer analysis framework, which considers the impact of routing and MAC protocol, is presented. Additionally, in [33], Reed-Solomon codes have been included to exploit the benefit of FEC codes in WSNs. In [34], Biswas et al. use ARQ schemes, which provide reliability by applying both implicit and explicit acknowledgments. Reference [34] exploits an adaptive retransmission scheme, in which maximum retransmission attempts are adjusted, based on the packet error rate at each node.
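One possible way of adjusting the retransmission limit to the observed packet error rate is sketched below. This is a hypothetical policy written for illustration, not the scheme of [34]: it picks the smallest number of attempts that reaches a target delivery probability, assuming independent losses, and never exceeds a configured cap.

```python
def retry_limit(packet_error_rate: float, target_delivery: float, cap: int) -> int:
    """Smallest attempt count whose delivery probability meets the target.

    Hypothetical policy: losses are assumed independent, so all attempts
    fail with probability packet_error_rate ** attempts. Returns `cap`
    when the target cannot be met within the cap.
    """
    if cap < 1:
        raise ValueError("cap must be at least 1")
    allowed_failure = 1.0 - target_delivery
    failure = 1.0
    for attempts in range(1, cap + 1):
        failure *= packet_error_rate
        if failure <= allowed_failure:
            return attempts
    return cap


if __name__ == "__main__":
    print(retry_limit(0.2, 0.99, cap=8))
    print(retry_limit(0.5, 0.99, cap=4))
```

A node with a clean link would therefore use few attempts and save energy and latency, while a node with a lossy link would be allowed more attempts up to the cap.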

Multipath Routing
Another technique to increase the reliability of the system is multipath routing. There are two interpretations of multipath routing [35]. The first is to construct multiple paths for each node to reach a particular destination in the network. In this approach, multiple next hops are specified at any particular node to forward the packet. If communication between a node and its next hop is disrupted due to interference, an alternative path can be used to transport the data. In [36], Ganesan et al. define a mechanism that constructs multiple paths, whereby one is used as a primary and the second as an alternative path.
In the second interpretation, several paths are defined from source to destination and the source sends the same packet through each of them. This approach requires a large amount of network resources. In order to use the network resources more efficiently, a mechanism is introduced in [37] that splits the data packet into n subpackets, i.e., by using erasure or FEC codes, and then transmits these instead of the whole packet. At the destination, only k subpackets (k < n) are required to reconstruct the original information. ReInForM [38] simultaneously sends redundant copies of a packet along multiple paths in order to increase the end-to-end probability of data delivery. The number of paths and the degree of redundancy are determined based on channel error rate and desired reliability. Reference [39] uses multiple paths in order to deliver packets to the sink for QoS provisioning, in terms of reliability and delay.
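The reliability gained from the k-out-of-n approach can be estimated with a binomial calculation. The sketch below assumes, for illustration only, that each of the n subpackets travels over its own path and arrives independently with the same probability; these assumptions and the function name are not taken from [37].

```python
import math


def reconstruction_probability(n: int, k: int, path_delivery: float) -> float:
    """Probability that at least k of n subpackets reach the destination.

    Assumption (not from the source): each subpacket is delivered
    independently with probability path_delivery.
    """
    if not 1 <= k <= n:
        raise ValueError("require 1 <= k <= n")
    q = path_delivery
    return sum(math.comb(n, i) * q ** i * (1.0 - q) ** (n - i)
               for i in range(k, n + 1))


if __name__ == "__main__":
    # Three subpackets, any two of which suffice, over 90 % reliable paths.
    print(reconstruction_probability(3, 2, 0.9))
```

With these example numbers, sending three subpackets of which two suffice is more reliable than a single 90 % path, while each subpacket is smaller than the original packet.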

Network Coding
The broadcast nature of wireless networks offers opportunities to address WSN problems, such as lack of reliability and low throughput. This characteristic provides a large overlap in the information available in the network. While two nodes are communicating, the same information is being delivered to the nodes within the transmitter's communication range. This data redundancy in the network can be used to improve the robustness of lossy wireless networks, as well as the throughput, and also to save bandwidth. In addition, the application of network coding in WSNs requires a simple, easily implemented protocol.
The robustness of data transmission can be enhanced by applying network coding in wireless networks. Network error correcting codes and network erasure correcting codes are considered as two possible techniques which use spatial redundancy concepts to improve robustness [40]. Network coding can help to enhance the robustness of data transfer by playing a role similar to that of traditional FEC, i.e., source-based coding schemes. In [41], Al-Kofahi and Kamal apply network erasure correcting codes in wireless sensor and wireless mesh networks to provide proactive protection against link failures. In [42], Widmer et al. show that network coding performs better than probabilistic routing in terms of reliability and robustness.
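A toy example can illustrate how coded redundancy protects against a link failure. In the sketch below, two packets are sent over two links and their XOR is sent over a third link, so that the destination can recover both packets when any single link fails. This is a deliberately simplified illustration, not the scheme of [41]; the link labels and function names are hypothetical.

```python
from typing import Dict, Tuple


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length payloads."""
    if len(a) != len(b):
        raise ValueError("payloads must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def protect(a: bytes, b: bytes) -> Dict[str, bytes]:
    """Payloads for three links: the two packets and their XOR combination."""
    return {"a": a, "b": b, "a^b": xor_bytes(a, b)}


def recover(received: Dict[str, bytes]) -> Tuple[bytes, bytes]:
    """Rebuild both packets from any two of the three link payloads."""
    if "a" in received and "b" in received:
        return received["a"], received["b"]
    if "a" in received and "a^b" in received:
        return received["a"], xor_bytes(received["a"], received["a^b"])
    if "b" in received and "a^b" in received:
        return xor_bytes(received["b"], received["a^b"]), received["b"]
    raise ValueError("at least two of the three payloads are required")


if __name__ == "__main__":
    links = protect(b"temp=21", b"pres=09")
    del links["a"]  # the link carrying packet a fails
    print(recover(links))
```

The protection is proactive in the sense that no retransmission or rerouting is needed after the failure; the cost is one additional transmission for every two packets.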
Network coding can also be applied to infer internal network characteristics, an approach called network tomography. Loss tomography refers to the study of link loss rate inference through network coding. In [43], the authors study the loss inference problem in sensor networks and present a passive loss network tomography scheme using network coding. In [44], by contrast, the authors study the identifiability problem and precisely compute the loss rate of links, rather than merely categorizing the links as either good or bad, as done in [43].

Mechanisms to Improve Latency and Real-Time Capability
In this section, we discuss MAC and routing protocols introduced in academia for addressing the real-time and latency requirements of WSN applications, such as industrial process automation.

MAC Protocols to Improve Latency
Generally, mechanisms used in MAC protocols can be categorized into two classes: (i) contention-free (scheduled communication) and (ii) contention-based. The contention-free protocols are presented first, after which a discussion of contention-based protocols will follow.

Contention-Free (Scheduled Communication)
Most contention-free protocols exploit the Time Division Multiple Access (TDMA) technique. In this scheme, the neighbor nodes are scheduled (either in a centralized or distributed manner) to communicate at a predefined timeslot. This mechanism provides a guaranteed and reliable hop-to-hop data transfer. Furthermore, scheduling and reserving the timeslots along the path toward the destination provides an end-to-end real-time data transfer.
PEDAMACS [45] uses the centralized approach in scheduling the communication in the network. It supports the tree topology and provides an end-to-end bounded delay data transfer. In [46], Chintalapudi and Venkatraman present a MAC protocol that uses time and frequency multiplexing. In this protocol, the star topology is considered and a contention-free scheme is utilized to restrict the hop-to-hop delivery time. They assume that the sink is equipped with multiple transceivers, which can operate at different channels selected adaptively, while the other nodes use one transceiver. Furthermore, HyMAC [47] is designed for a network in which the nodes are equipped with multiple-channel transceivers, i.e., a different channel can be selected in each transmission. In this protocol, a combination of TDMA and FDMA features is exploited. HyMAC uses the centralized approach to schedule the communication in the network and guarantees end-to-end bounded delivery time.
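The idea of reserving timeslots along the path toward the destination can be sketched for a tree topology. The following is a minimal centralized scheduler written for illustration; it is not the algorithm of PEDAMACS or HyMAC. It assumes one packet per node per frame, one slot per node, no spatial slot reuse, and a hypothetical `parent` map describing the routing tree.

```python
from typing import Dict


def depth(node: str, parent: Dict[str, str]) -> int:
    """Number of hops from `node` to the sink (the only node without a parent)."""
    hops = 0
    while node in parent:
        node = parent[node]
        hops += 1
    return hops


def build_schedule(parent: Dict[str, str]) -> Dict[str, int]:
    """Assign each node one exclusive slot in which it sends to its parent.

    Deeper nodes are scheduled first, so every node transmits before its
    parent does and a packet can reach the sink within a single frame.
    """
    order = sorted(parent, key=lambda n: (-depth(n, parent), n))
    return {node: slot for slot, node in enumerate(order)}


if __name__ == "__main__":
    tree = {"a": "sink", "b": "a", "c": "b", "d": "sink"}
    print(build_schedule(tree))
```

Because each slot is used by exactly one transmitter, there are no collisions inside the network, and under these assumptions the end-to-end delay is bounded by the frame length (the number of nodes times the slot duration).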

Contention-Based
Most of the contention-based protocols adopt the CSMA scheme. However, this scheme is unable to guarantee a bounded delivery time in either hop-to-hop or end-to-end data transfer.
Several MAC protocols have been introduced that try to decrease the message delivery time in hop-to-hop communication. For example, in [48] Ye et al. present a MAC protocol based on S-MAC [49] that improves the hop-to-hop forwarding delay by applying an adaptive listening enhancement. Similarly, References [50] and [51] decrease the forwarding hop-to-hop delay by exploiting an adaptive active period in S-MAC. In Alert [52], each node is equipped with one transceiver that can operate at different channels. Alert is designed to reduce the overall hop-to-hop delay to collect the messages from one hop neighbor in a star topology network.
Furthermore, reducing the end-to-end delay is addressed in several other MAC protocols. For instance, Reference [53] presents a new contention-based MAC protocol that notifies the nodes on a multi-hop path to the sink of data delivery in progress. This protocol reduces the end-to-end forwarding delay in the network by coordinating sleep schedules. Similarly, in [54], Vasanthi and Annadurai present Q-MAC, which provides a minimum end-to-end latency by alerting the intermediate nodes in advance using a dynamic schedule. In [55], Yang and Vaidya present a wakeup scheme that helps to improve the end-to-end delay. They assume the presence of two transceivers in each node, one to transmit a wakeup signal and the other to transmit data packets.

Routing Protocols to Improve Latency
The previous section listed a number of MAC layer mechanisms that are utilized to improve the latency. Furthermore, a considerable amount of work has been done in the routing layer to restrict end-to-end latency. This section highlights several routing protocols that are designed to reduce the end-to-end packet delivery time. For example, MERLIN [56] combines routing and MAC features into a single architecture. MERLIN shows that by employing multicast upstream and multicast downstream schemes for communication to and from gateways, the end-to-end delay is improved. In [57], the authors present an energy-aware QoS routing protocol, which finds a delay-constrained path that has the least possible cost based on a cost function defined for each link. SPEED [58] is a QoS routing protocol that provides soft real-time end-to-end guarantees. In SPEED, each node maintains information about its one-hop neighbors and exploits geographic forwarding to find the paths.
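Geographic forwarding based only on one-hop neighbor information can be sketched as a greedy rule: forward to the neighbor that is closest to the destination, provided it is closer than the current node. This is a simplified illustration and not the SPEED algorithm itself, which additionally takes delivery speed into account; the coordinates and names below are hypothetical.

```python
import math
from typing import Dict, Optional, Tuple

Point = Tuple[float, float]


def next_hop(current: Point, neighbors: Dict[str, Point],
             destination: Point) -> Optional[str]:
    """Greedy geographic forwarding using one-hop neighbor positions only.

    Returns the neighbor closest to the destination, or None when no
    neighbor is closer to the destination than the current node.
    """
    def distance(p: Point) -> float:
        return math.hypot(p[0] - destination[0], p[1] - destination[1])

    own = distance(current)
    closer = {name: distance(pos) for name, pos in neighbors.items()
              if distance(pos) < own}
    if not closer:
        return None
    return min(closer, key=closer.get)


if __name__ == "__main__":
    table = {"n1": (3.0, 1.0), "n2": (5.0, 0.0), "n3": (-2.0, 0.0)}
    print(next_hop((0.0, 0.0), table, (10.0, 0.0)))
```

Since each decision uses only local state, no end-to-end route needs to be stored or maintained, which keeps the per-node overhead small.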

Mechanisms Used by Industrial Technologies to Improve Performance Metrics
This section discusses the mechanisms used by industrial technologies for addressing the requirements of industrial automation wireless networks in terms of real-time capability and reliability. The mechanisms include Media Access Control (MAC) layer contention techniques, priority management schemes, channel hopping, and multipath routing.

MAC Layer Contention Mechanism and Communication Scheduling
A MAC protocol can generally be designed to operate using two mechanisms: (i) contention-free (scheduled communication) and (ii) contention-based. Contention-free approaches, e.g., those based on dedicated timeslots, are more suitable for supporting real-time communication, while shared timeslots (i.e., contention-based mechanisms) favor soft real-time applications.
Contention-based communication protocols, such as CSMA, are unable to provide timing guarantees when delivering messages. They are also prone to packet loss caused by the hidden terminal problem (internal interference). Since ZigBee Pro runs on a CSMA-based MAC protocol, it is unsuitable for applications that require reliable and timely packet delivery. WIA-PA and ISA100.11a also use a CSMA-based MAC (slow hopping) for subnet discovery and retries, but they are capable of switching to a slotted scheme in which every link is scheduled to transmit in a predefined slot (TDMA) and at a predefined channel offset, thereby avoiding internal interference. This mechanism is shown in Figure 2, which displays the combination of slow hopping and slotted hopping. A similar form of communication scheduling is also used in TSMP, WirelessHART, and IEEE 802.15.4e.
WISA, WirelessHART, and WIA-PA all use TDMA-based mechanisms with exclusively dedicated timeslots, which do not support any variation in traffic [13]. ZigBee Pro, ISA100.11a, and the 802.15.4e MAC standard, by contrast, allow the user or the centralized system manager to configure the timeslot length. This could be advantageous for coping with variable data traffic rates on the network, which could be a characteristic of factory automation applications requiring real-time operations. However, as individual nodes are unable to make autonomous decisions, existing technologies are unable to provide hard real-time guarantees, especially in the presence of variable data traffic.
ISA100.11a and WirelessHART use a superframe management technique to maintain real-time communication under high traffic loads. The superframe period determines several network performance parameters, such as packet delivery latency, energy consumption, and bandwidth utilization. ISA100.11a allows tradeoffs by enabling the system manager to determine the superframe period. A shorter-period superframe results in lower packet delivery latency and higher bandwidth utilization, but also in greater energy consumption, while a longer-period superframe has the opposite effect. ZigBee Pro and WIA-PA (at the cluster/star level) also allow the transmission of superframes with different lengths in the beacon mode.
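The latency and energy tradeoff of the superframe period can be made concrete with a small Python sketch. It rests on simplifying assumptions that are not taken from any of the standards: a 10 ms timeslot, one dedicated slot per node per superframe, a worst case in which a packet arrives just after the node's slot has passed, and the radio duty cycle used as a proxy for energy consumption. The function name `superframe_tradeoff` is hypothetical.

```python
SLOT_MS = 10.0  # assumed timeslot length in milliseconds

def superframe_tradeoff(slots_per_superframe, active_slots=1):
    """Return (worst-case wait in ms, radio duty cycle) for one node.

    Worst case: the packet is generated just after the node's own slot,
    so it waits roughly one full superframe period for the next one.
    """
    period_ms = slots_per_superframe * SLOT_MS
    worst_case_wait_ms = period_ms
    duty_cycle = active_slots / slots_per_superframe
    return worst_case_wait_ms, duty_cycle

short = superframe_tradeoff(10)    # short superframe
long_ = superframe_tradeoff(100)   # long superframe
```

Under these assumptions the short superframe bounds the wait at 100 ms with a 10% duty cycle, whereas the long one lowers the duty cycle to 1% at the price of a wait of up to one second.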

Priority Management Schemes
Timely and reliable data transport is crucial for industrial automation applications. Communication networks are usually designed to meet such criteria by using certain Quality of Service (QoS) mechanisms. QoS mechanisms generally use two techniques to achieve their goals: (i) traffic classification and (ii) resource reservation.
The traffic classification mechanism can be used for channel access and packet delivery along the path between the endpoints, by labeling the packets with a priority value and placing them on the corresponding queue in the path. The resource reservation mechanism is used for allocating and reserving the resources along the path between two endpoints for a specific traffic flow or class of traffic, in order to achieve the desired QoS requirement.
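Traffic classification at a single node can be sketched as a bounded priority buffer. The Python class below is a minimal illustration under our own assumptions: a lower number means a higher priority, packets of the same class are served in arrival order, and when the buffer is full an arriving packet either displaces the least important queued packet or is declined. The class name `PriorityBuffer` and the packet labels are hypothetical and do not come from any of the standards.

```python
import heapq
import itertools

class PriorityBuffer:
    """Bounded buffer that serves higher-priority packets first.

    Lower number = higher priority (an assumption of this sketch).
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self._heap = []
        self._seq = itertools.count()  # preserves FIFO order within a class

    def enqueue(self, priority, payload):
        """Return True if the packet is accepted, False if it is declined."""
        entry = (priority, next(self._seq), payload)
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, entry)
            return True
        worst = max(self._heap)  # least important, most recent entry
        if priority >= worst[0]:
            return False  # the new packet is not more important: decline it
        self._heap.remove(worst)
        heapq.heapify(self._heap)
        heapq.heappush(self._heap, entry)
        return True

    def dequeue(self):
        """Return the most important queued payload, or None if empty."""
        return heapq.heappop(self._heap)[2] if self._heap else None

buf = PriorityBuffer(capacity=2)
accepted = [
    buf.enqueue(3, "normal-1"),
    buf.enqueue(2, "process"),
    buf.enqueue(3, "normal-2"),  # buffer full, same priority as worst
    buf.enqueue(0, "alarm"),     # displaces "normal-1"
]
served = [buf.dequeue(), buf.dequeue(), buf.dequeue()]
```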
For example, in the wired CAN protocol (a communication system for industrial and automotive applications), a MAC layer technique is used to resolve contention between several nodes trying to access the channel. It involves bit-wise priority arbitration for collision resolution, which relies on a node's ability to transmit and receive simultaneously. Each packet has a priority value that is used to resolve the contention among the nodes. The node with a higher priority label in its data packet has a higher chance of accessing the channel. Each contender node transmits its priority value and receives feedback from the channel simultaneously. A node realizes that it has lost the contention when it detects a higher priority bit on the channel compared to the bit transmitted by the node itself [59,60]. This technique cannot be used in wireless sensor networks as they typically have half-duplex transceivers.
WIA-PA uses the traffic classification method for addressing different QoS requests. It defines four priority levels, based on different classes of data: command packets, process data, normal packets and alarm packets. Low priority packets are declined when the device buffers become full.
WirelessHART and ISA100.11a use a combination of traffic classification and resource reservation techniques to serve different QoS requests. When a device wants to establish communication with the central system manager or another device, it sends a contract request (a service request in WirelessHART) to the system manager. The request includes input parameters such as the communication service type (scheduled or unscheduled communication), destination address, traffic classification (best effort queued, real time sequential, real time buffer and network control), requested period, and committed burst for non-periodic communication. The system manager uses its centralized optimization algorithm to determine the required allocation of network resources (such as graphs and links) and sends a contract response to the source after all necessary network resources have been configured and reserved along the path. However, WirelessHART and ISA100.11a do not specify the optimization algorithms that the system manager can use to allocate resources.
Figure legend: Circles numbered 0 represent a group of devices using hopping pattern 1, whose hopping sequence is 19, 12, 20, 24, 16, 23, 18, 25, 14, 21, 11, 15, 22, 17, 13, 26. Circles numbered 5 represent another group of devices using hopping pattern 1 with a hopping offset of 5.
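The bit-wise arbitration used by CAN, described earlier in this section, can be sketched in a few lines of Python. The sketch assumes the usual CAN conventions: the bus behaves as a wired-AND, so a dominant 0 overrides a recessive 1; identifiers are 11 bits wide and sent most significant bit first; and a numerically lower identifier therefore has the higher priority. The function name `arbitrate` and the example identifiers are hypothetical.

```python
def arbitrate(identifiers, width=11):
    """Return the identifier that wins bit-wise arbitration.

    identifiers: non-empty iterable of distinct integer identifiers.
    The bus is modeled as a wired-AND of all bits transmitted in a bit time.
    """
    contenders = set(identifiers)
    for bit in range(width - 1, -1, -1):  # most significant bit first
        sent = {ident: (ident >> bit) & 1 for ident in contenders}
        bus = min(sent.values())  # any dominant 0 pulls the bus to 0
        # A node that sent a recessive 1 but reads a dominant 0 has lost
        # the contention and stops transmitting.
        contenders = {ident for ident in contenders if sent[ident] == bus}
    (winner,) = contenders
    return winner

winner = arbitrate([0x65, 0x123, 0x64])
```

Every node must read the bus while it is writing to it, which is exactly the full-duplex capability that typical wireless sensor node transceivers lack.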

Channel Hopping Techniques
Channel hopping is often used to mitigate external interference and multipath fading. The proper reception of wireless signals may be prevented by other radio signals generated by devices outside the network. This kind of interference is known as external interference. Signals in the same frequency range can be generated by Bluetooth devices, microwave ovens, other external networks (such as IEEE 802.11 networks), or unintended high-power sources of radio interference.
Figure 3 provides a classification of the different channel hopping techniques as well as the standards which use each of these techniques.
There is a tradeoff between using blind channel hopping and adaptive channel hopping (ACH). In the former, if the node switches to another congested channel or switches from a good channel to a congested one, the hop does not help to mitigate the interference and just wastes energy [30]. In spite of this disadvantage, blind channel hopping has less overhead, as the hopping pattern is already known by the network devices. In addition, if the system manager decides to blacklist a particular channel, nodes in the network still hop to the channel, but simply remain idle in that time period. Thus, the larger the number of blacklisted channels, the more time is lost by nodes idling on blacklisted channels. Blind channel hopping techniques ensure that while two communicating nodes hop in unison, neighboring node pairs never use the same frequency at the same time, in order to prevent hidden terminal problems. This is shown in Figure 4.
ACH differs from blind hopping in that nodes do not keep changing from one operating frequency to another at regular time intervals. Instead, nodes only change their frequencies when interference is detected on the current operating channel. However, nodes need to collaborate to decide which channel to switch to, and this can introduce a significant overhead, since nodes need to continuously scan all channels for interference levels and also need to ensure that, while communicating nodes choose the same frequency, neighboring node pairs use different channels [61]. In WIA-PA, in each cluster/star network, the cluster head and each node change their channel irregularly, on a link-by-link basis, only when channel conditions require it. However, to the best of our knowledge, there are currently no algorithms indicating strategies to be followed for ACH approaches in multi-hop networks. Both WirelessHART and ISA100.11a networks use blacklisting techniques to mitigate external interference or multipath fading. WISA and ZigBee Pro do not have this capability.

Figure 3. Classification of channel hopping techniques and the standards that use them. Blind channel hopping comprises (i) hopping over all available channels, used by WirelessHART, IEEE 802.15.4e, ISA100.11a, and WIA-PA (at the mesh level, between the routers); and (ii) whitelisting, in which network operation is limited to a subset of channels and a device may autonomously treat the problematic channels as idle. Whitelisting is applied either globally, as in WirelessHART, IEEE 802.15.4e, and ISA100.11a (the network manager, based on the reports it receives from the network, blocks and blacklists channels that are not working properly at a global scope), or locally, as in ISA100.11a (whitelisting on a link-by-link basis; a device may autonomously treat transmit links on problematic channels as idle). Adaptive Channel Hopping (ACH) is used by WIA-PA at the cluster/star level (the channel is changed on a link-by-link basis when necessary).
ZigBee Pro, however, uses "frequency agility". This mechanism is not as tolerant to fluctuating wireless conditions as those of WirelessHART, WIA-PA, and ISA100.11a. In this technique, the network channel manager collects interference reports from all the nodes. If external interference is detected, the network channel manager scans for a better channel and moves the entire network to the new channel. This technique requires network formation to be carried out again and thus introduces inconvenient delays. ACH is clearly a better technique in this respect, as it only requires the nodes facing interference to change their operating frequencies and does not affect the entire network. ISA100.11a tries to separate successive channels in the hopping pattern by at least 20 MHz. This means that consecutive hops are at least three channels apart, so that a retry on the next hop will not encounter the same IEEE 802.11 channel. The way in which this standard coexists with the IEEE 802.11 standard has been predefined. This hopping separation is larger than the coherence bandwidth (the frequency shift a link must undergo to move out of a deep fade) in the case of indoor multipath interference [26].
WISA employs frequency-hopping sequences in which consecutive hops are widely separated in frequency and the sub-band bandwidth is larger than the typical coherence bandwidth. WISA uses frame-by-frame frequency hopping, in which the radio channel used for a retransmission is independent of the previous one, so the likelihood of a successful transmission increases.
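The blind hopping behavior described in this section can be illustrated with hopping pattern 1 and the hopping offset of 5 quoted earlier in this article. The Python sketch below assumes that the channel for a timeslot is found by indexing the pattern with the slot number plus the group's offset, and that a node idles when the scheduled channel is blacklisted; both the indexing rule and the function name `channel_for_slot` are our own simplifications rather than the exact procedure of any standard.

```python
PATTERN_1 = [19, 12, 20, 24, 16, 23, 18, 25, 14, 21, 11, 15, 22, 17, 13, 26]

def channel_for_slot(slot, offset, blacklist=frozenset()):
    """Channel used in a timeslot, or None if the node idles because
    the scheduled channel is blacklisted."""
    channel = PATTERN_1[(slot + offset) % len(PATTERN_1)]
    return None if channel in blacklist else channel

# Two groups of devices follow the same pattern with offsets 0 and 5.
group_a = [channel_for_slot(s, 0) for s in range(16)]
group_b = [channel_for_slot(s, 5) for s in range(16)]
collisions = sum(a == b for a, b in zip(group_a, group_b))

# Blacklisting two channels leaves group A idle in two slots per cycle.
idle_slots = [channel_for_slot(s, 0, {12, 24}) for s in range(16)].count(None)
```

Because the pattern visits every channel exactly once per cycle, two groups with different offsets never share a channel in the same slot, and every blacklisted channel costs each group one idle slot per cycle.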

Multipath Routing
To ensure real-time capability, WISA requires the network to be deployed in a star topology rather than the multi-hop mesh topology used in ZigBee Pro, WirelessHART, WIA-PA and ISA100.11a. A disadvantage of this approach is that if communication between a node and its cluster head is disrupted due to interference, alternative routes cannot be established to transport the data. The multi-hop approach of ZigBee Pro, WirelessHART and ISA100.11a prevents such problems from occurring. Additionally, to ensure robust communication in WirelessHART and ISA100.11a, the system manager defines multiple paths for each node to reach a particular destination in the network.

Physical Layer Aspects
Although this paper does not aim to provide the reader with a detailed analysis of PHY research, we feel that the survey would not be complete without a brief discussion of the PHY. Rather than detailing all the standards (e.g., IEEE 802.15.1, IEEE 802.11) used for WSNs, we solely discuss the IEEE 802.15.4 standard in this section, since most of the popular standards, such as ZigBee Pro, WirelessHART, and ISA100.11a, are built on top of it. The IEEE 802.15.4 standard was developed for low data rate applications. The PHY of this standard operates in different frequency bands, mostly in the 2.4 GHz band, using Offset Quadrature Phase Shift Keying (O-QPSK) modulation with a data rate of 250 kbps.
No channel model has been proposed in this standard, so most of the analysis has been done with the channel model prescribed in the IEEE 802.11 standard. We can adjust this channel model to represent an industrial environment; the adjusted model is given by Equations (1) and (2), the first of which applies for $d < 8$ m. In these equations, $P_t$ is the transmit power in dB; $P_r$ is the received power in dB at distance $d$; and the shadowing term (in dB) is a Gaussian-distributed random variable with a certain mean and variance, generated by the shadowing effect. In the case of a home environment, the shadowing term is generally considered to be a zero-mean random variable. However, in the industrial environment, we expect large shadowing due to the presence of heavy machinery, which typically causes a positively biased shadowing effect. The mean of the shadow fading can vary according to different industrial setups.
The authors in [62] claim that the Signal-to-Noise Ratio (SNR) should not be used as a performance metric to calculate the reliability of IEEE 802.15.4 communication links, as the relationship between SNR and the Packet Error Rate (PER) is non-linear. A small degradation in SNR beyond the threshold can shift the PER from 0 to 1. However, SNR has a direct relationship with the bit error probability and outage probability, which are two important criteria of interest in the performance analysis of the IEEE 802.15.4 PHY. These performance metrics are dependent on the channel model. As most of the sensors and actuators in an industrial environment are immobile, we consider the channel to be a slow fading channel, which has almost the same performance as an Additive White Gaussian Noise (AWGN) channel.
In AWGN, the bit error probability for QPSK modulation is given by Equation (3), where $\gamma_0$ is the SNR of the received signal. The outage probability, $P_{out}$, is the probability that the received signal's average SNR falls below the minimum required SNR, $\gamma_0$, for the pre-defined acceptable communication performance [63]; it is expressed mathematically in Equation (4). From Equations (3) and (4), the required average SNR follows as Equation (5). In AWGN, the carrier-to-noise ratio of the received signal can be expressed as:
$$\frac{C}{N} = \frac{E_b}{N_0} \cdot \frac{f_b}{B} \qquad (6)$$
where $E_b/N_0$ is the SNR per bit; $f_b$ is the channel data rate (net bitrate) and $B$ is the channel bandwidth [64]. As $C_{dB} = P_r$, and using the corresponding relation for QPSK modulation, Equation (6) can be rewritten as Equation (7). By using Equation (7), the required received power can be calculated for a given bit error rate and outage probability, which in turn gives the communication range from Equations (1) and (2).
In wireless communication (PHY), a bit error rate of $10^{-4}$ and an outage probability of 0.02 are typically considered for receiver design. According to the IEEE 802.15.4 standard, these result in a short communication range of 20-50 m, with acceptable outage probability. However, if interference from devices operating in the same frequency band is considered, the communication range that guarantees the same outage probability is even shorter. Experiments show that IEEE 802.15.4 systems can operate with almost no problems in the presence of IEEE 802.11 systems in range. However, if another IEEE 802.15.4 system exists nearby, the result can be catastrophic. Coexistence issues are discussed in [4].
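The relation between the carrier-to-noise ratio and the SNR per bit used above is the standard identity C/N = (E_b/N_0)(f_b/B), which becomes a simple sum in decibels. The Python sketch below evaluates it for illustrative inputs that are our own assumptions and not values from the paper: an E_b/N_0 of 8.4 dB (roughly what uncoded QPSK needs for a bit error rate of 10^-4 in AWGN), the 250 kbps data rate mentioned above, and an assumed channel bandwidth of 2 MHz. The function name is hypothetical.

```python
import math

def carrier_to_noise_db(ebn0_db, bit_rate_hz, bandwidth_hz):
    """C/N in dB from Eb/N0 in dB, using C/N = (Eb/N0) * (f_b / B)."""
    return ebn0_db + 10.0 * math.log10(bit_rate_hz / bandwidth_hz)

# Assumed inputs: Eb/N0 = 8.4 dB, f_b = 250 kbps, B = 2 MHz.
cn_db = carrier_to_noise_db(ebn0_db=8.4, bit_rate_hz=250e3, bandwidth_hz=2e6)
```

With these inputs the ratio f_b/B contributes about -9 dB, so the required C/N is slightly below 0 dB; the required received power then follows once the noise power in the channel bandwidth is known.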

Open Research Areas
While existing wireless technologies developed for industrial applications are able to carry out monitoring tasks fairly well, significant advances are required before they can be used for reliable, real-time, distributed control operations. We now highlight some of the key areas which need to be addressed to make this a reality.

A Distributed Approach to Achieving Real-Time Operation
Large-scale, distributed, real-time control applications require data to be transmitted over long distances through a multi-hop network in a timely manner. A distributed resource reservation algorithm is needed which would allow source nodes, based on the requirements of the application and the traffic characteristics, to reserve network resources for their peer communications along their paths in order to address different QoS needs. The distributed nature allows the system to adapt quickly to disturbances or changes within the network and so meet the timing guarantees needed to support real-time control operation. While no such distributed mechanisms exist for present-day sensor nodes, relevant techniques from other networking-related domains could potentially be adapted to develop solutions that are suitable for wireless sensor and actuator networks. We briefly describe some of these relevant techniques.
QoS in multi-hop networks can be supported by different mechanisms, such as circuit switching, Asynchronous Transfer Mode (ATM) networks, and internet protocols (such as Integrated Services/RSVP, Differentiated Services, MPLS and constraint-based routing). It is also supported by IEEE 802.11e [65], ISA100.11a and WirelessHART in a centralized manner.
ATM signaling protocols address certain performance issues in terms of reliability and timeliness of packet delivery that are of importance in industrial applications that require closed-loop, real-time control. The ATM protocol uses a switching technique that combines the concepts of circuit switching and packet switching. For example, similar to circuit switching, before initiating data transfer, a virtual circuit is first established between the source and destination. The protocol also includes admission control mechanisms that help determine whether the required QoS guarantees can be provided. ATM uses statistical multiplexing techniques, similar to those used in packet switching, in order to cope with variable bit rates (i.e., 'bursty' traffic).
Internet protocols are mainly designed for multimedia applications. In those protocols, some mechanisms exist that allow a data receiver to request a specific end-to-end quality of service for its data flows or classes of data. RSVP signaling is used by several internet protocols, such as the Integrated Services Architecture, Differentiated Services, and MPLS, through which the application can reserve resources and set up the path between the source and destinations.
IEEE 802.11e, which is based on traffic classification mechanisms, provides different levels of service to its users. It defines different priorities, through which traffic is delivered in several access categories. This differentiation is achieved by using different amounts of time for which the channel must be sensed idle and different contention window lengths during backoff. This implies that high-priority traffic can access the channel with shorter back-offs than low-priority traffic. In addition, the packets are labeled with a priority value and placed in the corresponding queue in the path. Admission control, an important component of this standard, limits the amount of traffic admitted into a particular service class so that the network resources can be efficiently utilized.
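The effect of per-category idle-sensing times and contention windows in IEEE 802.11e can be illustrated with a short Python calculation. The parameter values below (arbitration inter-frame space numbers, minimum contention windows, a 9 microsecond slot and a 16 microsecond SIFS) are commonly cited defaults that we treat as illustrative assumptions, and the function name is hypothetical. The sketch only computes the mean wait before a first transmission attempt on an idle channel; it does not model collisions or retries.

```python
SLOT_US = 9.0   # assumed slot time in microseconds
SIFS_US = 16.0  # assumed short inter-frame space in microseconds

# (AIFSN, CWmin) per access category; illustrative values.
ACCESS_CATEGORIES = {
    "voice": (2, 3),
    "video": (2, 7),
    "best_effort": (3, 15),
    "background": (7, 15),
}

def mean_initial_wait_us(aifsn, cw_min):
    """Mean idle time before the first attempt: AIFS plus mean backoff."""
    aifs = SIFS_US + aifsn * SLOT_US
    mean_backoff = (cw_min / 2.0) * SLOT_US  # backoff uniform over 0..cw_min
    return aifs + mean_backoff

waits = {ac: mean_initial_wait_us(*params)
         for ac, params in ACCESS_CATEGORIES.items()}
```

Under these assumptions the mean wait grows from 47.5 microseconds for the highest-priority category to 146.5 microseconds for the lowest, which is how high-priority traffic tends to win channel access without any explicit reservation.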

Distributed Network Management
WirelessHART, WIA-PA (at the mesh level) and ISA100.11a use centralized network management techniques for communication scheduling and managing routes. While such an approach may be easier to implement, it has numerous disadvantages. Centralized systems often perform poorly in terms of reaction time, as all updates need to be sent first to the centralized system manager (i.e., gateway) for further processing. The gateway node then performs recalculations and disseminates updated instructions to the relevant nodes in the network. As the round-trip time for such decision-making actions can be very high (especially when network contention is high), centralized approaches are unable to cope with highly dynamic situations (e.g., 'bursty' data traffic, varying link quality, and node mobility). This problem is further exacerbated as the network is scaled up. This in turn may result in problems such as increased packet loss and delayed data delivery, which increase energy consumption. A distributed approach allows the system to adapt quickly to disturbances or changes within the network in real-time. However, current wireless control technologies that use distributed approaches also perform poorly in terms of reliability, efficiency and robustness.

Adaptive Channel Hopping (ACH) Algorithms
As discussed in the previous section, one of the solutions to mitigate interference and multipath fading is channel hopping. In adaptive channel hopping, the channel is changed on a link-by-link basis when interference is detected on the current operating channel. However, nodes need to collaborate to decide which channel to switch to, and this can introduce a significant overhead, since nodes need to continuously scan all channels for interference levels and also need to ensure that, while communicating nodes choose the same frequency, neighboring node pairs use different channels. To the best of our knowledge, there are currently no algorithms indicating strategies to be followed for ACH approaches in multi-hop networks. The challenges in defining appropriate solutions are the overhead and the simplicity of the algorithm.

Distributed or Centralized Radio Transmission Power Control
The transmission power used by a node can have a direct impact on the radio link quality, the level of interference and energy consumption. Ideally, all nodes should always use the least transmission power that will allow them to carry out their assigned tasks effectively. While the ISA100.11a specification allows the transmission power of individual nodes to be controlled, it does not describe any specific algorithms to perform adaptive power control. Several autonomous power control strategies, developed specifically for WSNs, can be found in the literature [66,67]. However, they all have certain drawbacks which would prevent them from being used in a harsh industrial environment. For example, while the technique presented in [66] can adapt, it is not designed to handle rapid link quality fluctuations that could be caused by moving metal objects or electromagnetic interference from motors or pumps that may be common in an industrial environment. While the authors in [67] rightly point out that interference is an issue that needs to be addressed when developing adaptive power control algorithms, the presented solution does not perform optimally as it is unable to correctly distinguish between weak signals and interference. This is an area that still requires further investigation.

Network Management Algorithms for Different Traffic Patterns
WirelessHART, WIA-PA (at the mesh level) and ISA100.11a use centralized network management techniques for communication scheduling and managing routes. However, those standards do not specify the optimization algorithms that the system manager can use to allocate resources. In [68-70], the authors propose centralized scheduling algorithms for convergecast in WirelessHART, considering linear and tree network models. The ISA100.11a standard supports peer-to-peer communication, in addition to uplink and downlink traffic. This feature makes the communication scheduling and route management algorithms more complicated than in WirelessHART, in which the main concern is forwarding the traffic toward the gateway and vice versa. To the best of our knowledge, there are currently no algorithms indicating strategies to be followed for communication scheduling and route formation in ISA100.11a.

In-Network Data Aggregation for Control Operations
Two types of aggregation, data aggregation and packet aggregation, are supported by WIA-PA in order to reduce the number of packet transmissions. WirelessHART, WISA, and ZigBee Pro do not support this function. In certain industrial closed-loop control applications involving multiple sensors and an actuator, raw sensor readings are streamed from the sensors to the actuator. The actuator subsequently performs computations using the readings to carry out the relevant control operations. This traditional approach, however, is not suitable for multihop wireless sensor networks, since they have highly limited bandwidths. The idea would then be to allow an intermediate node to carry out the computations and only send the final control output to the actuator, thus saving network bandwidth. However, as every actuation operation may be dependent on a different set of sensors, the nodes need to autonomously decide which node should act as the intermediate aggregation node that will be responsible for computing the control output. This technique will also contribute towards improving real-time operation.
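The bandwidth saving of computing the control output at an intermediate node can be estimated by counting packet transmissions per control cycle. The Python sketch below assumes a deliberately simple topology that is not taken from any deployment: every sensor is the same number of hops away from the intermediate node, which in turn is a fixed number of hops away from the actuator, and each hop costs one transmission. The function names are hypothetical.

```python
def transmissions_raw(num_sensors, hops_to_aggregator, hops_to_actuator):
    """Every raw reading travels the whole way to the actuator."""
    return num_sensors * (hops_to_aggregator + hops_to_actuator)

def transmissions_aggregated(num_sensors, hops_to_aggregator, hops_to_actuator):
    """Readings stop at the intermediate node, which forwards a single
    control output to the actuator."""
    return num_sensors * hops_to_aggregator + hops_to_actuator

raw = transmissions_raw(4, 2, 3)
aggregated = transmissions_aggregated(4, 2, 3)
```

For four sensors located two hops from the intermediate node and three hops from there to the actuator, streaming raw readings costs 20 transmissions per cycle against 11 with in-network computation; the saving grows with the number of sensors and with the distance between the intermediate node and the actuator.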

Applying MIMO and OFDM in Physical Layer
Over the last decade, multiple antenna and Multiple Input Multiple Output (MIMO) techniques have been widely discussed as a means to increase the reliability and throughput of various wireless systems. As discussed in Section 4.1.1, antenna diversity can also improve reliability by providing multiple different realizations of the channel. So, if one antenna gets an interfered or distorted signal, which is non-recoverable due to signal propagation through a deep faded channel, another antenna may receive a copy of the signal which is suitable for decoding. This can significantly improve link reliability. MIMO can increase the system throughput without increasing the bandwidth. This is made possible by using several antennas, each transmitting or receiving a portion of the signal with spatial diversity. Orthogonal Frequency Division Multiplexing (OFDM) takes this a step further by using orthogonal sub-carriers in a frequency band. OFDM not only increases the throughput of a system, but also makes it possible to use different levels of modulation for different sub-carriers, based on the channel state information (CSI). Such techniques have been applied in the recently developed IEEE 802.11n standard to increase WiFi throughput to the next level [64]. However, the complexity of the receivers also increases to facilitate this technology, which makes it challenging to implement in WSN applications. Building simple MIMO transceivers with low power consumption is still in the research phase [71].

Conclusions
Traditional wired industrial networking technologies have numerous drawbacks. They lack flexibility, face reliability issues (due to wear and tear) and are expensive to deploy and maintain. Wireless technology, however, through its increased pervasiveness, can introduce a completely new range of industrial applications as it has the potential to provide fine-grained, flexible, robust, low-cost and low-maintenance monitoring and control. While present-day wireless technologies have taken a step in the right direction, they still have severe limitations, especially when real-time, reliable distributed control operations are concerned. This article presents an overview of current wireless technologies and academic research, together with their deficiencies, and describes some key research issues that still need to be addressed in order to successfully extend the use of wireless technologies to the industrial monitoring and control sector.

Figure 2. The ISA100.11a communication scheduling mechanism that alternates between a TDMA and CSMA-based scheme.

Figure 4. Two pairs of nodes using different communication channels.

Table 2. Different classes of applications as defined by ISA.