Impact of Quality of Service on Cloud Based Industrial IoT Applications with OPC UA

The Industrial Internet of Things (IIoT) is becoming a reality thanks to Industry 4.0, which requires the Internet connection of as many industrial devices as possible. The sharing and storing of a huge amount of data in the Cloud allows the implementation of new analysis algorithms and the delivery of new "services" with added value. From an economic point of view, several factors can decide the success of new Industry 4.0 services but, among others, "short latency" can be one of the most interesting, especially in the industrial market, which is used to the "real-time" concept. For these reasons, this work proposes an experimental methodology to investigate the impact of quality of service parameters on the communication delay from the production line to the Cloud and vice versa, when gateways with OPC UA (Open Platform Communications Unified Architecture) are used for accessing data directly in the production line. In this work, the feasibility of the proposed test methodology has been demonstrated by means of a use case with a Siemens S7 1500 controller exchanging data with the IBM Bluemix platform. The experimental results show that, thanks to the proposed method, the solutions based on OPC UA for the implementation of industrial IoT gateways can be easily evaluated, compared and optimized. For instance, during the 14-day observation period of the considered use case, the great impact of the Quality of Service parameters on performance emerged. Indeed, the average communication delay from the production line to the Cloud may vary from less than 90 ms to about 300 ms.


Introduction
Nowadays, the industrial world is using many technologies created for the consumer market and the Internet: low-cost sensors, advanced computing and analytics [1]. The surprising level of connectivity supports the so-called fourth industrial revolution, promising greater speed and increased efficiency. New embedded sensors/instruments with connectivity features move data from the production site (i.e., the machines) to the Cloud. Data are collected from every production site in the world and the huge quantity of gathered data is then analysed. In this way, new services, derived from the analysis of the data, are offered with the intention of improving both the general system and the single machine performance [2]. Internet of Things (IoT) and Industrial Internet of Things (IIoT) are the keywords that label this ongoing evolution process in industrial automation [3]. The concept of Industry 4.0 (which has been addressed as the "fourth industrial revolution") is involved as well, but the terms are often misused; additional remarks are reported in the next section.
Even if there is not a clear distinction between IoT and IIoT (because, formally, IIoT is a subset of the IoT), what is usually named IoT can be considered the "consumer IoT", as opposed to the "Industrial IoT". Consumer IoT is mainly centred on human beings; indeed, the "things" are typically smart appliances interconnected with one another in order to improve user awareness of the surrounding environment. On the other side, it is usually said that the aim of IIoT is to integrate the Operational Technology (OT) and Information Technology (IT) domains. Differently from consumer IoT, IIoT communications are mainly machine oriented and can involve very different market sectors and activities. As a consequence, although the most general communication requirements of both IoT and IIoT are aimed at large-scale connectivity, the specific needs can be very different. For instance, industrial scenarios pay attention to the Quality of Service (QoS, e.g., in terms of determinism and communication delays), the availability and the reliability. Very generally and roughly speaking, it is possible to summarize that the IIoT originates when the IoT approach crosses the manufacturing stage of the production cycle.
Unfortunately, the ubiquity of the Internet is only one of the aspects of the new era, and not even the main one. The most researched subject is the utopian "single protocol" (i.e., accepted by any application market, industry and consumer) that could describe methods and data in a smart and flexible way. There are several examples of shared and widely used protocols in specific application markets and, probably, in industry, the most accepted protocol that harmonizes machine-to-machine (M2M) interaction is OPC UA (Open Platform Communications Unified Architecture). The OPC Foundation had great success in the past with "OPC Classic" and today it is proposing the OPC UA protocol as a more powerful successor, thanks to its platform-independent architecture. OPC UA makes use of the most recent concepts to include key features like security, a structured information model and auto-discovery functions. Thanks to these important foundations, M2M can become "smart", enabling highly flexible scenarios where machines could be self-organized. In contrast, the major Cloud platforms for data analytics and sales of services (e.g., Amazon S3, IBM Bluemix, Microsoft Azure, etc.) are still strongly related to the consumer market; they do not natively interface with OPC UA. Usually, gateways/proxies are required because their information model is different. Moreover, OPC UA is for industrial applications and it timestamps any data modification, while, on the contrary, the timely delivery of information among very different applications on the Internet with the needed level of flexibility, scalability and geographic coverage is still not possible [4,5]. Considering all these situations, low delay/latency may be one of the most important advantages of new services in the near future, especially in the industrial market [6], which is used to real-time operation. However, despite the fact that it is typically affirmed that an IIoT communication framework is "generally" designed to fulfil timing requirements and to minimize packet losses (e.g., for supporting functional safety and self-healing), very little, if any, research exists on experimental verification and tests in real-world scenarios. In particular, we focus on time-related performance metrics, which are considered the most demanding for industrial applications [7].
Accordingly, the target of this paper is to fill this gap by proposing an experimental methodology to investigate the impact on the delays of the various Quality of Service (QoS) parameters offered by current Cloud platforms, when they are used as sink/source points of data coming from machines with native OPC UA. The paper is deliberately focused only on data transfer passing through a usual gateway toward the Cloud platform; the time for data elaboration in the Cloud is out of the scope of this work. The structure of the paper is the following: Section 2 introduces the IIoT and Industry 4.0 scenario and the considered application; Section 3 details the proposed methodology, while Section 4 shows the experimental use case, the experimental results and the discussion about more general considerations. Finally, the conclusions are presented.

The Industrial IoT
"Industry 4.0" tries to increase efficiency in industry through the exchange/collection of information over the entire product lifecycle. This concept requires the creation of the so-called "digital twin" of the product (or manufacturing process), a virtual repository where all the information about each product instance is stored. The union of physical and cyber components is called a Cyber-Physical System (CPS) [8]. On the other hand, the consumer market uses IoT devices (smart devices) for providing better efficiency, comfort and safety by means of data and service exchange on a common infrastructure (the Internet) with standard interfaces. The IIoT defines the overlapping area where the IoT approach is used inside the Industry 4.0 architecture: this circumstance occurs especially when the product is physically created (manufactured), that is, in the so-called "operation phase".
The industrial automation sector has always been a pioneer of innovation and most industrial systems (machines and plants) have many smart field devices installed. Consequently, there are many connectivity options that can be used today but often only data related to control are used, while the great amount of other available data is simply dropped. Industry 4.0 proposes: to collect data with consistent data models across the whole plant; to analyse those data, extracting the meaningful information; and, lastly, to use services based on the revealed information. In the Industry 4.0 world, the field devices and the Cloud can communicate directly. Thus, the new industrial communication hierarchy is flat: services are available to any participant of the automation system [3,9]. This approach changes the situation of applications that now use only local parameters (i.e., related to just one plant or machine, e.g., [10,11]), widening the scope of the input data, since now they can come from other machines/plants worldwide. In addition, the high computational power in the Cloud can be exploited to aggregate/analyse/optimize data before their use in the field.
In the near future, Industry 4.0 applications will activate an endless loop with four main blocks: (1) devices in the field send measurements to the Cloud; (2) measurements are analysed with distributed algorithms in the Cloud; (3) the system in the field receives the optimized parameters back from the Cloud; (4) those parameters are used to adapt/improve the performance and efficiency of the production system.

The OPC UA in Industry
Until a very few years ago, the communication systems for industrial automation applications were oriented only to real-time performance suitable for industry and to maintainability based on "international standards". Among others, the most widespread industrial protocols today are, for instance: the wired EtherNet/IP, PROFINET, Powerlink and EtherCAT; and the wireless IEEE 802.11, ISA100.11a and WirelessHART. However, it is known that interoperability between systems of different vendors with different protocols is always difficult because of the incompatible information models for data and services. Industry 4.0 manufacturing systems cannot rely only on such legacy approaches in order to reach the required flexibility level.
The most promising solution for this challenge is the OPC UA of the OPC Foundation [12,13]: OPC UA defines the mode for exchanging information between industrial engineering systems. OPC UA enhances the old OPC Classic with extended features in terms of data modelling, address space architecture, discovery functionalities and security. In OPC UA, servers contain the structured information model that represents the data and the communication model is client-server. To date, OPC UA has been included in a large number of machines and systems and it is a "de facto" reference method for process-to-process communication. More specifically, the usual way to proceed is to offer OPC UA as the interface to export data from systems while, internally, the automation at field level is still implemented with traditional fieldbus networks. With OPC UA, the data in the address space (described in the following) can be modelled without being constrained to a specific communication protocol, obtaining information flows between heterogeneous systems with different data models. In conclusion, OPC UA is the major candidate to be the backbone protocol for the harmonization of different industrial automation networks and systems [14].

OPC UA Outline
OPC UA has been designed to facilitate the exchange of information across the hierarchy of systems that commonly coexist in industry: enterprise resource planning (ERP); manufacturing execution systems (MES); control systems; and, last but not least, field devices. OPC UA has a message-based communication and a service-oriented architecture (SOA) with clients and servers connected to any type of network.
A client application may use the OPC UA client API (application program interface) in order to send/receive OPC UA service requests/responses to/from the OPC UA server. From the programmer's point of view, the OPC UA client API is an interface that decouples the client application code from the client OPC UA communication stack. In the OPC UA API, there is a discovery service that can be used to find available OPC UA servers and to explore their address space. The OPC UA communication stack converts the calls to the OPC UA API into proper messages for the underlying network layers.
In the servers, the OPC UA server API and the OPC UA communication stack are very similar to the client ones. As an additional feature, the server has the so-called "address space", in which it can expose the objects to be exchanged. In OPC UA, a multiplicity of data structures (called "nodes") can exist, representing, for instance: variables, complex objects, methods (i.e., remotely called functions) and definitions of new types for creating new OPC UA metadata. A hierarchical structure of arbitrary complexity can be created with OPC UA, since an object node may contain other variables, objects, methods and so on. In other words, the OPC UA address space is the information model for the communication: real hardware devices or real software "objects" (sensors, actuators, software applications, etc.) are available for OPC UA communication only if they are modelled, added to the address space and finally discovered by the OPC UA clients.
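To make the idea of a hierarchical address space more concrete, the following minimal sketch models nodes as a simple tree and walks it in the way a client might browse a server. It is only an illustration: the `Node` class, the `browse` function and the node names are hypothetical and do not correspond to the API of any real OPC UA stack.

```python
from dataclasses import dataclass, field
from typing import Any, Iterator, List, Optional, Tuple


@dataclass
class Node:
    """Hypothetical, simplified stand-in for an OPC UA node (not a real stack API)."""
    name: str
    node_class: str                      # e.g. "Object", "Variable", "Method"
    value: Optional[Any] = None          # only meaningful for variables
    children: List["Node"] = field(default_factory=list)

    def add(self, child: "Node") -> "Node":
        """Attach a child node and return it, so that calls can be chained."""
        self.children.append(child)
        return child


def browse(node: Node, path: str = "") -> Iterator[Tuple[str, Node]]:
    """Depth-first walk of the tree, yielding (path, node) pairs."""
    full_path = f"{path}/{node.name}"
    yield full_path, node
    for child in node.children:
        yield from browse(child, full_path)


# Illustrative model: a machine object exposing one variable and one method.
root = Node("Objects", "Object")
machine = root.add(Node("Machine", "Object"))
machine.add(Node("Temperature", "Variable", value=21.5))
machine.add(Node("Reset", "Method"))

paths = [p for p, _ in browse(root)]
for p in paths:
    print(p)
```

Only what has been added to the tree is reachable by `browse`, which mirrors the remark that a device or software object is available to clients only once it is modelled in the address space.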

Quality of Service
The definition of "Quality of Service" (QoS) usually depends on the application field because the "service" may vary from case to case [15]. Generally speaking, in industry, the QoS is often related to the timeliness of services or to the guaranteed availability grade of a given service, but it should be remembered that QoS timeliness does not necessarily imply a guaranteed delivery. Some examples are:

• the IEC 61850, where the QoS is related to the achievable class of latency in transferring data among the different parts of an electrical Substation Automation System;
• the Ethernet Time Sensitive Networking (TSN) group of standards, where the QoS is referred to the maximum allowed delay for a stream of transferred data;
• the MQTT (Message Queuing Telemetry Transport), where the QoS is tied to the confirmation that a message is delivered to any subscriber (which in turn impacts latency).
As a consequence, the definition of "quality of service parameters" is even wider, since it refers to any parameter that can affect the QoS of the desired service.
The task of measuring the effect of the variation of some parameters on the QoS of a given service could be cumbersome, requiring a specific experimental setup for each different application. On the contrary, in this paper a general approach, focused on measuring latency and jitter of services, is proposed for general cases and is not bound to any specific industrial application.

The Proposed Approach
In any communication system, the delay of a data transfer is related to the path of the data. Specifically, the user cannot modify most of the parts within a Cloud-based architecture, which appears as a "black box". There are two methods for the estimation of the overall delay and its sub-components: simulations can be done as described in [16,17], or experiments can be set up as in [18,19]. However, it has to be highlighted that simulations offer the possibility to finely control the impact of varying the quantity of interest, but they can suffer from oversimplified models that do not accurately represent real-world scenarios. For this reason, in this work, a general-purpose test methodology based on an experimental approach is designed and discussed; it can fit several situations that are actually found in real plants. In particular, as previously stated, the focus is on the evaluation of time-related metrics.

Description of the Experimental Method
In the typical IIoT service scenario there is a machine that sends its data to the Cloud; there, data are elaborated together with other information from other machines; then, the outcome is "sold" to be used in some other field application/machine and, for this reason, it has to come back to the production field. It should be noted that, compared to IoT applications for mobile devices [20], the considered IIoT scenario is simpler.
The proposed method is tailored for the described situation and it is intended to measure the delay of the communication between: a data originator in the field (machine) and a data destination in the Cloud; and, on the reverse route, a data originator in the Cloud and a data destination in the field. The focus of this paper is to evaluate only the delay due to communication, without taking into account the elaboration of data; for this reason, the originator and the destination are the same machine/cloud application. Moreover, the sample data are sent back in a loopback as soon as they reach the Cloud.
The block diagram of the proposed approach is shown in Figure 1. There are three players: the Machine; the IIoT Gateway; and the Cloud Application. Two OPC UA nodes, A and B, are created such that their properties include an array of timestamp values (T1 to T8). Nodes are sent from the Machine to the IIoT Gateway with the OPC UA protocol and from the IIoT Gateway to the Cloud Application using one of the most widespread IoT messaging protocols (e.g., MQTT, Message Queuing Telemetry Transport). As soon as the nodes A and B pass through one of the timestamping points (labelled in the picture with the name of the corresponding timestamp), the matching timestamp value is loaded with the time of the operating system local clock. The Machine publishes data A at time T1; the IIoT Gateway gets it at time T2; shortly after, at time T3 > T2, the IIoT Gateway sends A to the Cloud Application, which finally receives it at time T4. Immediately, the Cloud Application copies the object A to object B and starts the reverse path, sending B back to the IIoT Gateway at T5. When B is received at T6 in the IIoT Gateway, it is routed to the Machine as an OPC UA node with departure time T7 and arrival time T8. After a complete roundtrip, the data structure contains all the timestamps (which are useful for the estimation of the delays).
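The sequence of timestamping points described above can be sketched as follows. This is a minimal single-process simulation under stated assumptions: the record is a plain dictionary, `stamp` and `round_trip` are hypothetical helper names, and no OPC UA or MQTT transfer actually takes place. In the real setup each timestamp is taken by a different actor with its own local clock.

```python
import time
from typing import Callable, Dict


def stamp(record: Dict[str, float], label: str,
          clock: Callable[[], float] = time.time) -> Dict[str, float]:
    """Store the current local clock reading under the given timestamp label."""
    record[label] = clock()
    return record


def round_trip(clock: Callable[[], float] = time.time) -> Dict[str, float]:
    """Simulate the path Machine -> Gateway -> Cloud -> Gateway -> Machine."""
    node_a: Dict[str, float] = {}
    stamp(node_a, "T1", clock)   # Machine publishes A
    stamp(node_a, "T2", clock)   # Gateway receives A over OPC UA
    stamp(node_a, "T3", clock)   # Gateway forwards A to the Cloud
    stamp(node_a, "T4", clock)   # Cloud Application receives A
    node_b = dict(node_a)        # Cloud Application copies A into B
    stamp(node_b, "T5", clock)   # Cloud Application sends B back
    stamp(node_b, "T6", clock)   # Gateway receives B
    stamp(node_b, "T7", clock)   # Gateway routes B to the Machine
    stamp(node_b, "T8", clock)   # Machine receives B
    return node_b


record = round_trip()
print(sorted(record))
```

Passing the clock as a parameter is a design choice made here only to keep the sketch testable; it also hints at the fact that each timestamping point relies on whatever clock is local to it.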
The first important metric is the OPC UA end-to-end delay (OD), which can be calculated in the two directions using the timestamps T1 and T2 from Machine to Gateway and the timestamps T7 and T8 from Gateway to Machine:

OD_MG = T2 − T1,   OD_GM = T8 − T7

The second important metric is the Cloud messaging protocol end-to-end delay (MD), which is obtained for the two directions using the timestamps T3 and T4 from Gateway to Cloud and the timestamps T5 and T6 from Cloud to Gateway:

MD_GC = T4 − T3,   MD_CG = T6 − T5

The last metric is the total end-to-end communication delay (ED) from Machine to Cloud and from Cloud to Machine, which is just the sum of the previously calculated partial paths:

ED_MC = OD_MG + MD_GC,   ED_CM = MD_CG + OD_GM

Here the subscripts indicate the direction of the transfer (M = Machine, G = Gateway, C = Cloud). Summarizing, the proposed experimental approach allows for the characterization of the communication delay of a generic industrial IoT application that uses OPC UA to access generic Cloud services. The three metrics presented above are simultaneously available in the experiments and can be used as performance indexes for the comparison of different setups.
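As a worked illustration of the three metrics, the sketch below computes them from a complete set of timestamps. The function name `delay_metrics` and the labels `OD_MG`, `OD_GM`, `MD_GC`, `MD_CG`, `ED_MC` and `ED_CM` are hypothetical names chosen here for the two directions of each delay, and the numeric timestamps are invented for the example; they are not measured values.

```python
from typing import Dict


def delay_metrics(t: Dict[str, float]) -> Dict[str, float]:
    """Compute OPC UA (OD), messaging (MD) and total (ED) delays from T1..T8."""
    od_mg = t["T2"] - t["T1"]   # OPC UA delay, Machine to Gateway
    od_gm = t["T8"] - t["T7"]   # OPC UA delay, Gateway to Machine
    md_gc = t["T4"] - t["T3"]   # messaging delay, Gateway to Cloud
    md_cg = t["T6"] - t["T5"]   # messaging delay, Cloud to Gateway
    return {
        "OD_MG": od_mg,
        "OD_GM": od_gm,
        "MD_GC": md_gc,
        "MD_CG": md_cg,
        "ED_MC": od_mg + md_gc,  # total delay, Machine to Cloud
        "ED_CM": md_cg + od_gm,  # total delay, Cloud to Machine
    }


# Invented timestamps in seconds, for illustration only.
example = {
    "T1": 0.000, "T2": 0.010, "T3": 0.012, "T4": 0.100,
    "T5": 60.100, "T6": 60.180, "T7": 60.182, "T8": 60.190,
}
metrics = delay_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value * 1000:.1f} ms")
```

Since the total delay is defined as the sum of the two partial paths, the time spent inside the Gateway (T3 − T2 and T7 − T6) and inside the Cloud (T5 − T4) does not enter any of the metrics.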
In particular, the effect of different QoS parameter settings on the performance indexes of the communication delay will be studied in this paper, by means of a suitable use case.
Electronics 2018, 7, x FOR PEER REVIEW 6 of 14

An OPC UA Use Case
In order to show the applicability of the proposed measurement methodology, a sample use case is considered. It is clear that the obtained results depend on all the blocks, components and software composing this specific use case.
The experimental setup is realistically based on commercially available components and software for industrial automation. The Machine in the field uses a Siemens S7-1516 controller with an embedded OPC UA communication stack. The IIoT Gateway has been built with a Siemens IOT2040 (embedded device with Yocto Linux-kernel 2.2.1, Intel Quark X1020, 1 GB RAM, 8 GB SD disk, 2 × Ethernet ports, 2 × RS232/485 interfaces, battery-backup RTC). The considered Cloud platform is IBM Bluemix; it runs the "Internet of Things Platform-m8" service, while it uses the "Node-RED" framework for programming the data transfer/elaboration. IBM Bluemix has several access/use profiles: in this paper, the free access version is used, resulting in limited features in terms of Cloud computational resources (which are out of the scope of this paper). However, no limitations on the communication features are mentioned in the contract.
The IOT2040 Gateway is attached to the same network as the Machine (in Milan, at the Siemens Spa headquarters), hence the local area network introduces a negligible delay. Moreover, the Siemens network connection to the Internet has a very high quality with extended availability. As an additional remark, it has to be considered that network paths are not always guaranteed to be the same, due to the well-known Internet asymmetry [21]; however, the proposed methodology can identify such an asymmetry as well.

Node-RED Flows for the Experimental Setup
The experiments have been carried out using specific Node-RED flows. Node-RED is the graphical tool created by IBM for "wiring together hardware devices, APIs and online services in new and interesting ways" (from the Node-RED website, https://nodered.org/). Node-RED uses the well-known JavaScript runtime Node.js and, by means of a browser-based editor, allows for drag and drop, connection and configuration of "nodes" (i.e., open-source functions and interfaces). Thus, a program in Node-RED describes the data flow from source to destination passing through elaboration steps; for this reason, it is simply called a "flow". Node-RED is an efficient option for applications that are aimed at prototyping some IoT connectivity. In the specific application of the paper, Node.js has been considered as the reference platform for the experimental implementation of the use case due to a trade-off between effectiveness and cost of human resources (programmers). The overhead of Node.js can be estimated at around 10% of the processing power of the gateway used in the demonstration use case. CPU and memory usage increase linearly with the number of devices connected to the gateway. As a consequence, Node.js on embedded devices is only recommended for small projects without demanding requirements, or for experimental test environments, as in the case of the current research.
In this paper, Node-RED is used both in the IIoT Gateway and in the Cloud application. The flows that implement the formulas/algorithms described in the previous section are shown in Figure 2 for the IIoT Gateway along the path from Machine to Cloud, in Figure 3 for the IIoT Gateway along the path from Cloud to Machine and in Figure 4 for the Cloud application.
It should be noted that the main limit of the OPC UA solution is currently the high computational power required by OPC UA stacks. Commonly available PLCs support a limited number of concurrent connections. In addition, the OPC UA implementation for Node-RED in the use case is also resource consuming in the Gateway. Moreover, the Internet connection bandwidth could also affect the delays. Since this paper is focused on the measurement methodology and on the demonstration of the feasibility of such a methodology by means of a sample use case, further investigations about different use cases are out of scope.
The flows are deliberately kept as simple as possible to reduce computational overhead. The Gateway forwards the data object (containing the timestamp values as properties) to the Cloud using nodes of the OPC UA Library and of the IBM Bluemix Library. The properties of the data object are updated at the corresponding timestamping points. The Cloud application sends back any incoming data object after a fixed delay of 60 s. The data object is regularly sampled in the flows and saved to files for backup purposes; in case the complete path is not available, partial metrics can still be computed.
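The remark on partial metrics can be illustrated with a short sketch: when a saved sample holds only some of the timestamps, each delay whose two timestamps are both present can still be evaluated. The mapping `PAIRS`, the function `partial_metrics` and the sample values are hypothetical and serve only as an example of the idea; they are not taken from the flows used in the experiments.

```python
from typing import Dict, Optional, Tuple

# Each partial delay is defined by a (start, end) pair of timestamp labels.
PAIRS: Dict[str, Tuple[str, str]] = {
    "OD_MG": ("T1", "T2"),   # OPC UA, Machine to Gateway
    "MD_GC": ("T3", "T4"),   # messaging, Gateway to Cloud
    "MD_CG": ("T5", "T6"),   # messaging, Cloud to Gateway
    "OD_GM": ("T7", "T8"),   # OPC UA, Gateway to Machine
}


def partial_metrics(sample: Dict[str, float]) -> Dict[str, Optional[float]]:
    """Return every delay that can be computed; None marks an unavailable one."""
    result: Dict[str, Optional[float]] = {}
    for name, (start, end) in PAIRS.items():
        if start in sample and end in sample:
            result[name] = sample[end] - sample[start]
        else:
            result[name] = None
    return result


# Invented sample where the return path was never completed.
sample = {"T1": 0.0, "T2": 0.5, "T3": 1.0, "T4": 3.0}
result = partial_metrics(sample)
print(result)
```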
In this paper, for the configuration of the connection to the IBM Cloud, several QoS settings have been used and compared:
• "quickstart" mode (unregistered user);
• QoS = 0 with the "registered user" mode;
• QoS = 1 with the "registered user" mode.

Synchronization Uncertainty in the Experimental Setup
In the proposed experimental methodology, the desired metrics are obtained by combining timestamps taken by different actors working in a distributed system. Hence, the measurement uncertainty depends on several factors; among those, the major contributions are: the synchronization uncertainty among actors; the frequency uncertainty of the local oscillator in the physical device that takes the timestamp; and the uncertainty of the software delay of the routines that correlate the event and the snapshot of the local clock to obtain the "timestamp".
Fortunately, the entire transaction (from Machine to Cloud and back) is expected to last a few seconds; as a result, the contribution of the local oscillator uncertainty can be neglected (usually the crystal deviation over a short period is in the order of a few parts per million).
In this paper, the Machine, the IIoT Gateway and the Cloud platform use the Coordinated Universal Time (UTC) as time reference.The NTP (Network Transfer Protocol) has been used to synchronize each system with a local time server locked to GPS clock.The synchronization uncertainty using NTP may vary with the quality of the network connection in terms of latency [22,23] but all the considered systems are connected via a local area network with time server, reducing the variability to the minimum.Nevertheless, the synchronization uncertainty of the devices involved in this use case is estimated in different ways.For the IIoT Gateway, the time synchronization is experimentally measured considering the residual time offset after compensation, as listed in the NTP statistics [23].For the Machine, the equivalent synchronization uncertainty is derived from Siemens documentation that sets the maximum error to 0.2 ms and by supposing the error distribution is uniform.Only the synchronization uncertainty of the IBM Cloud platform is difficult to be estimated, since no documents or literature is available on this topic.Anyway, it is hard to think that IBM cloud servers are worse than the other actors of the experiment case.For these reasons, the synchronization uncertainty of the IBM Cloud platform has been considered equal to the contribution of the IIoT Gateway.
The synchronization standard uncertainty results are shown in Table 1, where the experimental standard uncertainty is evaluated as the worst case because the systematic error sn μ introduced by the operating system is not compensated.Anyway, the resulting synchronization uncertainty is always less than 0.1 ms with respect to UTC.The timestamping uncertainty has been experimentally estimated at application level in the three actors.In details, a suitable software routine triggers a software delay at time T9 and takes a timestamp T10 when such a delay expires.Since timestamps and delay are obtained with the local system clock, the quantity Δ = T10 -T9 should be theoretically identical to the imposed delay value.
In truth, the timestamp uncertainty tn u affects both T9 and T10.Including the systematic error μ Δ , the timestamping uncertainty is

Synchronization Uncertainty in the Experimental Setup
In the proposed experimental methodology, the desired metrics are obtained by combining timestamps taken by different actors working in a distributed system. Hence, the measurement uncertainty depends on several factors; among those, the major contributions are: the synchronization uncertainty among actors; the frequency uncertainty of the local oscillator in the physical device that takes the timestamp; and the uncertainty of the software delay of routines that correlate the event and the snapshot of the local clock to obtain the "timestamp".
Fortunately, the entire transaction (from Machine to Cloud and back) is expected to last a few seconds; as a result, the contribution of the local oscillator uncertainty can be neglected (the crystal deviation over a short period is usually in the order of a few parts per million).
In this paper, the Machine, the IIoT Gateway and the Cloud platform use Coordinated Universal Time (UTC) as the time reference. NTP (Network Time Protocol) has been used to synchronize each system with a local time server locked to a GPS clock. The synchronization uncertainty using NTP may vary with the quality of the network connection in terms of latency [22,23], but all the considered systems are connected to the time server via a local area network, reducing the variability to the minimum. Nevertheless, the synchronization uncertainty of the devices involved in this use case is estimated in different ways. For the IIoT Gateway, the time synchronization is experimentally measured considering the residual time offset after compensation, as listed in the NTP statistics [23]. For the Machine, the equivalent synchronization uncertainty is derived from the Siemens documentation, which sets the maximum error to 0.2 ms, and by supposing that the error distribution is uniform. Only the synchronization uncertainty of the IBM Cloud platform is difficult to estimate, since no documents or literature are available on this topic. Anyway, it is hard to think that the IBM Cloud servers are worse than the other actors of the experiment. For these reasons, the synchronization uncertainty of the IBM Cloud platform has been considered equal to the contribution of the IIoT Gateway.
The synchronization standard uncertainty results are shown in Table 1, where the experimental standard uncertainty is evaluated as the worst case $u_{sn} = \sqrt{\mu_{sn}^2 + \sigma_{sn}^2}$, because the systematic error $\mu_{sn}$ introduced by the operating system is not compensated. Anyway, the resulting synchronization uncertainty is always less than 0.1 ms with respect to UTC.
The timestamping uncertainty has been experimentally estimated at application level in the three actors. In detail, a suitable software routine triggers a software delay at time T9 and takes a timestamp T10 when such a delay expires. Since timestamps and delay are obtained with the local system clock, the quantity $\Delta = T10 - T9$ should be theoretically identical to the imposed delay value. In practice, the timestamp uncertainty $u_{tn}$ affects both T9 and T10. Including the systematic error $\mu_{\Delta}$, the timestamping uncertainty is $u_{tn} = \sqrt{(\mu_{\Delta}^2 + \sigma_{\Delta}^2)/2}$, where $\sigma_{\Delta}$ is the experimental standard deviation of $\Delta$. Table 2 shows the timestamp standard uncertainty $u_{tn}$ of the considered system.
In particular, if the proposed approach is used (Equations 1 to 6 are considered), the standard uncertainty $u_{mn}$ of any delay calculated between any two points (n and m) in the flow in Figure 1 is modelled as $u_{mn} = \sqrt{u_{sn}^2 + u_{tn}^2 + u_{sm}^2 + u_{tm}^2}$. It is clear that it is always dominated by the timestamp uncertainty.
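The uncertainty budget described above can be sketched numerically as follows. The sketch assumes that the worst-case standard uncertainty is the root sum of squares of mean and standard deviation, that the uncertainty of the quantity T10 - T9 is shared equally by the two timestamps, and that the contributions of the two actors are uncorrelated. The function names and the sample values are hypothetical; they are not the measured data of Tables 1 and 2.

```python
import math
from statistics import mean, pstdev
from typing import Sequence


def worst_case_uncertainty(errors: Sequence[float]) -> float:
    """sqrt(mu^2 + sigma^2): the systematic error is not compensated."""
    return math.hypot(mean(errors), pstdev(errors))


def timestamp_uncertainty(delta_errors: Sequence[float]) -> float:
    """Uncertainty of a single timestamp from the errors of T10 - T9.

    Assumes both timestamps contribute equally, hence the division by sqrt(2).
    """
    return worst_case_uncertainty(delta_errors) / math.sqrt(2)


def delay_uncertainty(u_sn: float, u_tn: float, u_sm: float, u_tm: float) -> float:
    """Combine synchronization and timestamp terms of actors n and m
    (assumed uncorrelated)."""
    return math.sqrt(u_sn**2 + u_tn**2 + u_sm**2 + u_tm**2)


# Invented samples, in ms.
ntp_offsets = [0.02, 0.04, 0.03, 0.05, 0.01]  # residual NTP offsets
delta_errors = [1.0, 3.0, 2.0, 2.0, 2.0]      # measured delay minus imposed delay

u_s = worst_case_uncertainty(ntp_offsets)
u_t = timestamp_uncertainty(delta_errors)
u_mn = delay_uncertainty(u_s, u_t, u_s, u_t)
```

With these invented values the timestamp term is much larger than the synchronization term, which mirrors the observation that the combined uncertainty is dominated by the timestamp uncertainty.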

Use Case Results
The experimental setup was aimed at estimating all the metrics used as performance indicators by means of a single experiment. The experiment has been repeated changing the QoS settings as described in Section 4.1. The measurement campaigns took 14 days. A new measurement loop is started every minute; when the round trip ends, all the delay values of the run are stored. Each experimental campaign has more than 8000 valid samples and the resulting performance indicators are reported in Table 3. The results of Table 3 are discussed in the following Sections 4.4 and 4.5, where the probability density functions of the measurements are also shown.

Discussion of the Use Case Result about Overall Delays
The detailed discussion of the overall delays is carried out only under the QoS setting called "quickstart", which is the setting offered by the Cloud platform of this use case.
The probability density function estimates of the OPC UA delay in the two directions (Machine to Gateway, OD_MG, and Gateway to Machine, OD_GM) are shown in Figure 5. In the Gateway to Machine direction there is a single, well-defined peak centred on the mean value; probably, the reception timestamp is assigned directly in the reception interrupt inside the Machine PLC, because the mean time is low. On the other hand, in the Machine to Gateway direction, the distribution shows three peaks with equal inter-distance of almost exactly 100 ms (please note that the peak centred at 280 ms is barely visible in Figure 5). By analogy with similar situations [24], the multimodal shape of the OD_MG distribution indicates that at least one of the tasks that manage the OPC UA data in the Machine to Gateway direction operates on the basis of discrete cycle times. Under such a hypothesis, it may be inferred that the OPC UA task inside the Machine PLC is executed cyclically every 100 ms and that timestamps are assigned inside that task.
The probability density function estimates of the Cloud messaging protocol end-to-end delay (MD) in the two directions (Gateway to Cloud and Cloud to Gateway) are shown in Figure 6. The behaviour is similar and the two distributions have only two main peaks. Finally, the probability density function estimates of the total end-to-end communication delay ED from Machine to the Cloud and from Cloud to Machine are shown in Figure 7. As expected, the distributions show several peaks, because they are the convolution of the distributions in Figures 5 and 6.
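The convolution argument can be illustrated with a small numerical sketch. It assumes that the OPC UA delay and the messaging delay are independent and represents each distribution as a discrete probability mass function on a 10 ms grid. The bin width, the peak positions and the probabilities are invented for illustration and do not reproduce Figures 5 to 7.

```python
from typing import List, Sequence

BIN_MS = 10  # hypothetical bin width of the discretized distributions


def convolve(p: Sequence[float], q: Sequence[float]) -> List[float]:
    """Discrete convolution: distribution of the sum of two independent delays."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            out[i + j] += p_i * q_j
    return out


def local_maxima(pmf: Sequence[float]) -> List[int]:
    """Indices of the bins that are strictly higher than both neighbours."""
    padded = [0.0] + list(pmf) + [0.0]
    return [
        i - 1
        for i in range(1, len(padded) - 1)
        if padded[i] > padded[i - 1] and padded[i] > padded[i + 1]
    ]


# Invented OPC UA delay: two peaks 100 ms apart (10 bins), as for a cyclic task.
od = [0.0] * 11
od[0], od[10] = 0.6, 0.4

# Invented messaging delay: two peaks 30 ms apart (3 bins).
md = [0.7, 0.0, 0.0, 0.3]

ed = convolve(od, md)           # distribution of the total delay
ed_peaks = local_maxima(ed)     # one peak per combination of input peaks
```

Each peak of one distribution is replicated at every peak of the other, so two bimodal inputs produce four peaks in the total delay in this example.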

Discussion of the Use Case Result Varying QoS Settings
The three QoS settings described in Section 4.1 can be compared. The delays OD_MG and OD_GM vary by a few milliseconds in the three situations. The QoS settings (which only influence the Gateway to Cloud and Cloud to Gateway paths) do not affect the OPC UA drivers of the Machine and of the Gateway. For the sake of completeness, it should be said that the OD_MG distribution maintains the equidistant peaks with any QoS setting.
The probability density function estimates of MD_GC are shown in Figure 8, while the probability density function estimates of MD_CG are shown in Figure 9. Here, the three distributions are clearly different. The behaviours of QoS 1 and QoS 0 in registered mode are typical of MQTT data transfers across the Internet (as shown in [25,26,27]): QoS 0 is faster but has a larger standard deviation, while the QoS 1 distribution is narrow with a higher mean value. The probability density function of the "quickstart" mode has the same shape as that of QoS 1 in registered mode but has the highest mean value.
Moreover, during the experiments, there were some anomalous delay values for MD_GC and MD_CG; a few samples (i.e., <0.2%) were well above three times the standard deviation and were marked as outliers. A careful analysis revealed that the anomalies appear only at the same hour during the night on different days, and that the number of outliers was greater in the "quickstart" mode than in the QoS 1 registered mode. In conclusion, the sporadic anomalous delays may be due to the "free access" version of the IBM Bluemix platform, which has no guaranteed quality of service, but it is clear that the "quickstart" mode nodes have the lowest priority among the considered settings.
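The outlier marking can be sketched as follows. Since the exact rule is not detailed in the text, the sketch assumes that a sample is an outlier when it exceeds the mean by more than three standard deviations; the threshold rule, the function name and the sample values are assumptions made for this example.

```python
from statistics import mean, pstdev
from typing import List, Sequence, Tuple


def split_outliers(
    delays_ms: Sequence[float], k: float = 3.0
) -> Tuple[List[float], List[float]]:
    """Separate the samples above mean + k * standard deviation from the rest."""
    threshold = mean(delays_ms) + k * pstdev(delays_ms)
    valid = [d for d in delays_ms if d <= threshold]
    outliers = [d for d in delays_ms if d > threshold]
    return valid, outliers


# Invented delays in ms: 999 ordinary samples and one delayed by several seconds.
delays = [200.0 + (i % 10) for i in range(999)] + [5000.0]
valid, outliers = split_outliers(delays)
outlier_fraction = len(outliers) / len(delays)
```

A single pass of this rule is enough here because the anomalous sample is far from the bulk of the distribution; with many large outliers the standard deviation itself would be inflated and a more robust statistic might be preferable.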

Generalization of the Use Case Results
The sample use case demonstrates the feasibility of the proposed measurement methodology, giving also some directions for more general applications.
First of all, the PLC used (Siemens S7 1516) is a medium-performance PLC whose characteristics, including the OPC UA support, can also be found in PLCs of other manufacturers; for this reason, the proposed approach can also be implemented with other control systems.
The gateway architecture, which has been implemented on a Node.js platform, can be easily ported to different hardware (e.g., Raspberry Pi devices) and improved, from the performance point of view, by optimizing the OPC UA library. However, the logic flow of the measurement methodology remains the same.
Last, the cloud platform can be changed (e.g., Microsoft Azure, or Amazon S3) provided that a suitable messaging protocol is supported. MQTT is generally available, but AMQP (Advanced Message Queuing Protocol) can also be considered. Anyway, the proposed methodology is not bound to a specific messaging protocol, guaranteeing the consistency of the results.
Moreover, three important general observations arise from analysing the results of the use case:
• The QoS of the OPC UA stack is clearly related to the implementation inside the PLC; thus, the delays are expected to scale up or down depending on the ratio between the PLC computational performance and the computational load.
• The concern about the performance of the Cloud is more about availability (lack of timely response in some cases) than latency. The QoS at this level is closely related to the messaging protocol performance through the Internet, while the Cloud computational power practically does not affect the results (since the proposed methodology correctly decouples it).
• The performance of the Gateway is not stressed in the considered use case, but a reasonable dependency of the results on the throughput (in terms of messages per second) is expected. In large applications, the gateway must scale its performance according to the desired number of data exchanges with the cloud.

Conclusions
The success of the Industrial Internet of Things (IIoT), from the economic point of view, depends on several factors. Among others, "short latency" can be one of the most interesting, especially in the industrial market, which is used to the "real-time" concept. This paper deals with a methodology to measure time delay metrics in OPC UA systems in order to study the impact that quality of service parameters have on the communication delay from the production line to the Cloud and vice versa. By means of a sample use case, the proposed method was applied, its feasibility was demonstrated and the results were generalized. In the use case, a Gateway exploiting the widely accepted OPC UA was used for data access directly in the devices inside the production line. The experimental results show that the average overall delay is always below 300 ms, while the impact of the QoS parameters on the communication delay is clearly visible. The major experimental evidence in the medium term (14 days) is that the average delay from production line to Cloud is tightly related to the QoS settings of the IBM Bluemix platform. The "quickstart" mode has the worst performance, with an average delay of 290 ms from Machine to Cloud and 170 ms in the opposite direction. If the QoS 0 "registered user" mode is used, the average delays decrease to 220 ms and 80 ms, respectively. The QoS 1 "registered user" mode has higher delays than QoS 0 but a lower standard deviation. Finally, it should be highlighted that the "free access" version of the IBM Bluemix platform has no guaranteed quality of service: some samples (<0.2%) may be delayed by several seconds.

Funding: The research has been partially funded by research grant MIUR SCN00416, "Brescia Smart Living: Integrated energy and services for the enhancement of the welfare" and by University of Brescia H&W grant "AQMaSC".

Figure 1. The block diagram of the setup for the experiment about the impact of quality of service parameters on the communication delay between a Machine with OPC UA interface and a Cloud platform. Black arrows show the path from the Machine to the Cloud, while blue arrows represent the reverse path. The red arrow is the software loop in the cloud that enables automatic bidirectional experiments.

Figure 2. Node-RED flow for the IIoT Gateway related to the path from Machine to Cloud. The timestamps T2 and T3 are taken in this flow.

Figure 3. Node-RED flow for the IIoT Gateway related to the path from Cloud to Machine. The timestamps T6 and T7 (T7 = T6) are taken in this flow.

Figure 4. Node-RED flow for the Cloud application. Data is received from the IIoT Gateway and sent back after a fixed delay (one minute). The timestamps T4 and T5 are taken in this flow.

Figure 5. Probability density function estimate of the OPC UA end-to-end delay (OD) in the two directions (Machine to Gateway and Gateway to Machine).

Figure 6. Probability density function estimate of the Cloud messaging protocol end-to-end delay (MD) in the two directions (Gateway to Cloud and Cloud to Gateway).

Figure 7. Probability density function estimate of the total end-to-end communication delay ED from Machine to the Cloud and from Cloud to Machine.

Figure 8. Probability density function estimate of the Gateway to Cloud delay (MD_GC) using different QoS parameters in IBM Bluemix.

Figure 9. Probability density function estimate of the Cloud to Gateway delay (MD_CG) using different QoS parameters in IBM Bluemix.

Table 1. Synchronization uncertainty over an observation time of 14 days (ms).

Table 2. Timestamp uncertainty over an observation time of 14 days (ms).

Table 3. Delay for the considered use case over an observation time of 14 days (ms).