1. Introduction
IoT infrastructure is shaping the world as the exchange of data between physical items and the virtual world becomes increasingly common. The number of IoT devices connected to the network is currently estimated at around 50 billion [1], and the amount of data they generate could reach 79.4 zettabytes by 2025. The IoT can be considered a global network infrastructure made up of numerous devices with sensing, communication, networking, and information-processing capabilities [2]. However, long data transmission distances between end-users and cloud servers lead to higher network data flow, data loss, latency, and energy usage [3]. This is where fog computing emerges, with the intention of bringing computation and storage capabilities to the edge of the network. A distributed fog computing infrastructure is effective for delay-sensitive IoT applications, providing minimal latency and energy consumption for data processing on resource-limited fog/edge devices.
Fog computing typically has a layered architecture with three layers: the end-device (terminal) layer, the fog layer, and the cloud layer [4]. The end-device layer consists of end-devices distributed around the accessible area; they collect data and transmit them to the upper layers. The fog layer, situated at the edge of the network between the cloud layer and the end-device layer, is designed for the computing, filtering, joining, and transforming [5], as well as the storage, of data; its nodes can be both mobile and static. The cloud layer consists of high-performance storage devices and servers. However, the number of layers or their names may differ slightly. In [6], an orchestrator has its own layer between the fog layer and the IoT application layer, making four layers in total. The paper [7] suggests using two fog layers instead, where the first consists of nodes with small to medium computation capacity and the second is made of powerful ones. In fog radio access networks (F-RANs) supporting 5G, there is a network access layer between the cloud layer and a logical fog layer [8]. Finally, five layers are identified in the reference architecture [9]: (1) sensors, edge devices, and gateways; (2) network; (3) cloud services and resources; (4) software-defined resource management; and (5) IoT applications and solutions. Please see Figure 1 for more details.
Even though the number of fog computing layers may differ slightly, the end-device layer always consists of IoT devices such as smartphones, tablets, computers, sensors [10], remote machines, and smart vehicles [11], which communicate in a wired or wireless manner [12] with the fog layer. These end-devices may have limited resources [13] such as storage, computational power, and bandwidth.
Multi-agent systems (MASs), in turn, as the review article [14] claims, have gained extensive attention from scientists in different areas such as computer science and civil engineering. They can solve complex tasks by breaking them down into smaller ones, each of which can be assigned to agents that function independently of one another. Agents decide which actions to take based on their action history, interactions, and goals. Multi-agent systems have recently become a popular research topic and are applied in areas such as unmanned aerial vehicles, the industrial internet of things, and wireless sensor networks, as per the publication [15]. However, as the same publication adds, irrespective of all these benefits, multi-agent systems are very vulnerable to network attacks due to the open communication environment and the complexity of the system. A MAS lacks an integrated system to monitor and manage the activities of all the network nodes. Information exchange is usually very high, but the information flow cannot be verified, and the system is therefore at a security risk.
This paper includes the following sections: Introduction, Related Work Review, Service Placement Orchestrator Implementation, Materials and Methods, Results, Discussion, and Conclusions. Apart from the general information in the Introduction, the Related Work Review section details the problems and potential solutions found in other research works. The orchestrator design is defined in the Service Placement Orchestrator Implementation section based on the relevant characteristics. The Materials and Methods section discusses the hardware and software as well as the test methods. The Results, Discussion, and Conclusions sections give an insight into the experimental results, followed by assumptions and concluding remarks.
2. Related Work Review
Fog computing is still a developing paradigm, and it has plenty of challenges to overcome. A review publication [1] identified a number of such challenges, including mobility, scalability, availability and reliability, resource management, application placement strategies, and security and privacy. The heterogeneous nature of fog computing may lead to structural challenges. Some fog nodes may have limited resources; therefore, there is a need to develop distributed applications [4]. Additionally, it can be difficult to maintain service access authentication to preserve privacy. End-devices are closer to their users and can therefore collect more sensitive data, which raises privacy concerns for end-users. Private location and personal data can be disclosed by an untrusted party hacking a poorly protected node [16].
Applications usually have some objectives and constraints [17]. A set of objectives can be conveyed as diverse QoS requirements. Constraints, in their turn, can be application-related or network-related; they may include meeting application deadlines, bandwidth requirements, security, privacy, power consumption, etc. Fog devices are highly distributed and resource constrained [18], in contrast to cloud computing. One of the key problems in running an application in the fog environment is resource allocation, the purpose of which is to select devices that have the resources available for the required application services. Resource allocation can be divided into three main categories: resource placement, resource scheduling, and resource migration. To be more specific, resource placement defines where to place resources, resource scheduling defines when this has to happen and at what scale, and resource migration determines where these resources can be moved. Throughout, resource monitoring and metrics have to be considered.
Service placement has to be optimized as much as possible for fog nodes to use their resources efficiently [17]. The main purpose of an optimization is to reduce or increase certain features based on the objectives [19]. The optimization itself can be (a) heuristic, (b) metaheuristic, (c) machine learning based, or (d) mathematical programming, as well as diverse combinations of these [17]. A heuristic approach offers a sub-optimal solution within a reasonable time. A metaheuristic approach provides an optimal solution with nature-inspired methods that are not fixated on a local optimum. Machine learning is heavily dependent on network resources and the quality of training, while mathematical programming is meant for optimizing a single performance parameter. The PSO algorithm, as a metaheuristic method, seems well suited to multi-objective optimization, except that it can lead to premature convergence and a local optimum when the diversity of the population is insufficient [20], which may require some extra attention.
Optimization can address a single-objective problem or a multi-objective problem. A multi-objective optimization can lead to trade-offs when fulfilling conflicting goals [21] simultaneously while taking constraints into consideration [22]. There is no single best solution; instead, a set of solutions is obtained. There is a large number of centralized optimization techniques, as the paper [23] suggests. However, centralized solutions require a centralized controller to keep track of the global system information, and if the controller node fails, the whole system fails, creating a single point of failure [24]. A decentralized or distributed design can be an answer to this issue, since multiple fog controllers are involved in making a placement decision. The downside of this approach is that it takes additional effort to identify the fog nodes fit enough to take on a controller role. Service placement task scheduling algorithms can be divided into immediate, batch, preemptive, non-preemptive, static, and dynamic [25]. What sets dynamic service placement algorithms apart is that service discovery mechanisms are defined in such a way that available services are known to other services through their interaction [26]. Each service (microservice) entry is dynamically updated in the service registry to determine availability.
A service placement solving technique can also be offline or online [27]. In an offline technique, all the requirements and constraints are known in advance; to be more precise, a placement decision is made at compile time [28]. Online placement decisions, meanwhile, are made at runtime. It is, however, more beneficial to treat placement as an online technique, as it is more dynamic and capable of reacting to current changes in the infrastructure.
Dynamic networks are those in which both the mobile fog nodes and the end-users change their characteristics over time, including changes to the network topology [29]. A fog computing infrastructure has to be mobility aware, which is challenging due to these dynamics. Dynamicity can relate to the fog infrastructure or to the application [28]. The fog computing architecture is highly dynamic [20], and its nodes may join or leave the network at any time. The resources of a particular node may not be sufficient to host a requested service, which leads to lower QoS, longer response times, or even service failure. Applications, therefore, should be deployed or removed dynamically while considering the capabilities of the changing fog infrastructure [28].
Please see Table 1 for a review summary.
There are numerous research papers trying to solve service placement or application placement using a dynamic approach. Some of them use a distributed control method instead of a centralized or federated one, and some focus on resilience. However, almost none are dedicated to dynamic service placement performed in a distributed way with the intention of keeping the system more resilient through that distributed approach.
3. Service Placement Orchestrator Implementation
3.1. Design Motivation
The aim of this orchestrator design process is to demonstrate how orchestrators make decisions to control their services with respect to the QoS and security requirements. Each decision the orchestrators make to start/stop/move their services must be verified in a dynamic way, checking whether the minimal requirements related to the various hardware and software restrictions of the involved devices, as well as the requirements arising from the peculiarities of the application area (e.g., sensitive data should be protected better than environment monitoring data), are met.
The orchestrator design should consider its three main stages:
Each orchestrator as a part of the first stage should take into account such requirements as security, CPU, RAM, and power based on the application area and diverse fog node hardware/software capabilities to decide if it is possible to launch all required services without violating these requirements;
The second stage is to find an optimal distribution for deployable services among different fog nodes. It can be vital for saving energy and computation resources in cases when some services need to be stopped, suspended, or moved to other fog nodes;
The third stage is a dynamic service placement for a situation when circumstances change during the runtime, and orchestrators need to change the distribution of their services among available fog nodes according to these new conditions.
The orchestrator in the first stage should have all the information of services and placements in all the fog nodes collected and synchronized, and should be aware of the available QoS and security requirements. If all minimal requirements are satisfied, services can be launched. During the second stage, an orchestrator should search for an optimal placement of its services. The orchestrator should detect in the third stage certain changes in its resources or environment indicators, process this data, and relocate its services from the current fog node to another one.
Considering these design requirements and the benefits of multi-agent systems, a MAS provides a good starting point. A MAS behaves like a network that can correct and analyze itself [51]. Its intelligent network nodes can both function individually and cooperate to pursue their collective and individual goals. The integration of a MAS constitutes a complex framework designed for system control and optimization. However, security issues must be addressed to maintain its resilience, and resource monitoring needs a solution due to the lack of an integrated tool.
Particle Swarm Optimization (PSO) is a population-based metaheuristic technique used to solve optimization problems [52]. It imitates the social behavior of birds, where each bird in the flock approaches the target food based on its individual experience and the social experience of the flock; the problem is thus solved through a principle of social interaction. This technique is well suited to optimizing continuous non-linear functions. As with a flock of birds in the wild, PSO starts with a swarm of potential solutions, each represented by a particle. At each iteration, the population is updated by updating every particle's velocity and position based on the personal best value and the global best value. Each particle converges towards its new position until the global optimum is found. Multiple objectives, however, require IMOPSO to obtain a set of non-dominated service placements.
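For reference, the canonical PSO update rule can be written as follows (standard notation from the PSO literature, not this paper's specific implementation):

$$v_i^{t+1} = w\,v_i^{t} + c_1 r_1 \left(p_i^{t} - x_i^{t}\right) + c_2 r_2 \left(g^{t} - x_i^{t}\right), \qquad x_i^{t+1} = x_i^{t} + v_i^{t+1},$$

where $w$ is the inertia weight, $c_1$ and $c_2$ are the cognitive and social coefficients, $r_1$ and $r_2$ are random numbers drawn uniformly from $[0, 1]$, $p_i$ is the particle's personal best position, and $g$ is the global best position.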
The Analytical Hierarchy Process (AHP) is used as a second technique to choose the best solution from the Pareto set. AHP, developed by Saaty, is a technique that helps to simplify complex and poorly structured problems through a number of pairwise comparisons [53]. The decision criteria are organized hierarchically and given weight coefficients based on their potential impact on achieving the desired goal. At the end of the process, the best service distribution alternative, the one corresponding to the highest criteria priority, is chosen as the final optimization output.
In order to meet design requirements, the whole orchestrator design process is broken down into separate subsections, which include resource monitoring, starting new services, data synchronization, security maintenance, and the orchestrator architecture itself. All these subsections are needed to design a service placement orchestrator as a multi-agent system that overcomes its inherent shortcomings of network attack vulnerabilities and the absence of an integrated monitoring tool.
3.2. Resource Monitoring
Resource monitoring is implemented using a monitoring agent. It uses a cyclic behavior to wait for messages, whose content is filtered with the method startsWith() in order to take the relevant action. It can receive messages from a battery voltage agent (BattVoltAgent), a light agent (LightingAgent), or a sensor agent (AbstractSensorAgent). The battery voltage agent keeps track of the battery charge level of a Raspberry Pi 4 acting as a fog node. The Pi 4 does not have a native analogue-to-digital conversion (ADC) option; an external converter such as the MCP3424 18-bit 4-channel ADC would be needed. For simulation purposes, however, a high or low voltage on pin 3 is checked using the Pi4J library.
A sensor agent is used to monitor external resources such as light intensity or temperature. A 5 s interval is used both as the sensor update interval, which involves communication between end-devices and particular agents, and as the sensor agent poll interval, which involves communication between agents and the orchestrator. A sensor agent can receive a few commands such as Get, Set, Move, or Changeip. If a sensor agent receives a Get message, it identifies the sender and responds with a value obtained using the getValue() method, sent as an ACL INFORM message. These messages are sent as part of a regular sensor polling task at a predefined interval such as 5 s; they can also be triggered once a threshold value is reached.
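A minimal sketch of such a monitoring agent in JADE is given below; the message prefixes and the sensor-reading helper are illustrative assumptions rather than the prototype's exact code:

```java
import jade.core.Agent;
import jade.core.behaviours.CyclicBehaviour;
import jade.lang.acl.ACLMessage;

// Illustrative monitoring agent: waits for messages in a cyclic behavior
// and dispatches on the content prefix, as described above.
public class MonitoringAgent extends Agent {
    @Override
    protected void setup() {
        addBehaviour(new CyclicBehaviour(this) {
            @Override
            public void action() {
                ACLMessage msg = myAgent.receive();
                if (msg == null) {
                    block(); // suspend until the next message arrives
                    return;
                }
                String content = msg.getContent();
                if (content.startsWith("Get")) {
                    // Respond with the current sensor value as ACL INFORM.
                    ACLMessage reply = msg.createReply();
                    reply.setPerformative(ACLMessage.INFORM);
                    reply.setContent("value=" + readSensorValue());
                    myAgent.send(reply);
                } else if (content.startsWith("BattVolt")) {
                    // e.g., forward a low-battery notification to the Decision Maker.
                }
            }
        });
    }

    private double readSensorValue() {
        return 21.5; // placeholder for a real getValue() call
    }
}
```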
Communication with the end-devices to obtain their values is performed using the CoAP protocol. The lighting and temperature agents keep polling their end-devices for values as part of an onTick() event, which is triggered periodically after a timeout interval. The CoAP request method GET is used to obtain a value from the end-point device, while the PUT method is used to set a value.
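As an illustration, such a CoAP exchange could be implemented with the Eclipse Californium library (an assumption for this sketch; the paper does not name the CoAP library used):

```java
import org.eclipse.californium.core.CoapClient;
import org.eclipse.californium.core.CoapResponse;
import org.eclipse.californium.core.coap.MediaTypeRegistry;

// Illustrative CoAP polling of an end-device (endpoint addresses from Section 3.3).
public class CoapPollExample {
    public static void main(String[] args) throws Exception {
        CoapClient temp = new CoapClient("coap://192.168.0.24/temp");
        CoapResponse response = temp.get();          // GET the current temperature
        if (response != null) {
            System.out.println("temp = " + response.getResponseText());
        }

        CoapClient led = new CoapClient("coap://192.168.0.24/led");
        led.put("80", MediaTypeRegistry.TEXT_PLAIN); // PUT a required light level
    }
}
```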
As shown in Figure 2, an end-device keeps periodically communicating with its fog node. Once the monitoring agent detects that the battery voltage is below the required level, it sends a service redistribution request to the fog node's Decision Maker agent. A remote service redeployment is calculated using IMOPSO and AHP, and the solution is communicated to the Execution agent. Services in the current fog node are stopped. The redeployment solution is synchronized between the current fog node and the fog node the services have to be moved to. A Synchronization agent sends a request to its Execution agent, and the end-device service is assigned to fog node 1.
3.3. Starting New Service
Service requests are generated when a new end-device appears or when a current end-device moves from one place to another. End-devices at the current fog node are disconnected, and they need to send a request to a closer fog node. These end-devices are identified by their end-point address, such as coap://192.168.0.24/temp for a temperature device or coap://192.168.0.24/led for lights, which can be changed or adjusted as required. Please see Figure 3 for more details.
The light agent works as a light sensing element. If the light level outside falls below a certain threshold, it sends a request to its Service agent to start a new service. The Service agent defines what that service should be like. When that is a light service, it can define what intensity of the light is needed. The Service agent afterwards sends a defined request to the Bulb agent. It uses the PUT method to communicate with the end-device to set a required level of light. As the level is adjusted, the outcome is synchronized.
3.4. Data Synchronization
Currently, synchronization among orchestrator agents and communication among fog nodes are implemented as a three-way communication topology. Upwards communication provides vertical synchronization with a parent agent one level up in the hierarchy. Downwards communication sends messages one level down to a child agent. This is done by sending ACL INFORM messages between the internal fog node agents. Sideways communication is horizontal communication among nearby fog nodes, and it is carried out by Synchronization agents. All the fog nodes need to keep their data about resources and statuses updated in order to make informed decisions about the best possible service placement when starting new services. This is also necessary when currently available services are relocated due to resource or security restrictions. All horizontal communication is currently carried out via Wi-Fi, but it could also be completed via Bluetooth, ZigBee, or even Ethernet. Please see Figure 4 for more details.
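A minimal sketch of such a horizontal synchronization message in JADE follows; the peer agent name and the content format are illustrative assumptions:

```java
import jade.core.AID;
import jade.core.Agent;
import jade.lang.acl.ACLMessage;

// Illustrative Synchronization agent snippet: pushes the local resource
// status to a peer fog node (agent name and content are assumptions).
public class SyncExample extends Agent {
    @Override
    protected void setup() {
        ACLMessage sync = new ACLMessage(ACLMessage.INFORM);
        // Fully qualified name of the peer Synchronization agent (illustrative).
        sync.addReceiver(new AID("SyncAgent@fognode2:1099/JADE", AID.ISGUID));
        sync.setContent("Status cpu=35 ram=250 batt=OK");
        send(sync); // delivered across platforms by JADE's messaging service
    }
}
```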
To keep the agent list synchronized, it is updated using the AMS service component, which allows a search to be launched based on certain constraints and descriptions and which tracks agent registrations and deregistrations. A GetAgentList message is sent to all known orchestrators in the fog nodes to retrieve the list. A list of possible orchestrators with their IP addresses is stored, but their presence in the fog network is confirmed by a response. Please see Figure 5 for more details.
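A minimal sketch of such an AMS lookup using JADE's AMSService helper is shown below; the unconstrained search and the printout are illustrative:

```java
import jade.core.Agent;
import jade.domain.AMSService;
import jade.domain.FIPAAgentManagement.AMSAgentDescription;
import jade.domain.FIPAAgentManagement.SearchConstraints;
import jade.domain.FIPAException;

// Illustrative agent that queries the AMS for all registered agents.
public class AgentListExample extends Agent {
    @Override
    protected void setup() {
        try {
            SearchConstraints constraints = new SearchConstraints();
            constraints.setMaxResults(-1L); // -1 means "no limit"
            AMSAgentDescription[] agents =
                    AMSService.search(this, new AMSAgentDescription(), constraints);
            for (AMSAgentDescription description : agents) {
                System.out.println("Known agent: " + description.getName());
            }
        } catch (FIPAException e) {
            e.printStackTrace();
        }
    }
}
```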
3.5. Security Maintenance
In the default configuration, agent communication messages are neither encrypted nor signed. This may give a hacker using a malicious agent the opportunity to sniff message content or even modify it, leading to malicious instructions for the recipient agent. Without security checks, various requests can be sent to the AMS to eliminate agents. To maintain security, the JADE security add-on JADE-S was used. After it is installed, services such as SecurityService, PermissionService, SignatureService, and EncryptionService have to be enabled. SecurityService is the primary one; the others are optional and can be enabled as required.
Instead of the method send(), the method sendMessage() is used, which allows us to specify whether the message has to be signed and encrypted. Credentials are checked by the method retrievePrincipal(), which is triggered before the relevant message is sent. It first sends a request message with the content “get-principal”, and the answer formed by the recipient is sent back as an inform message. Please see Figure 6 for more details.
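For illustration, requesting signature and encryption for a single ACL message via the JADE-S SecurityHelper could look roughly like the sketch below. The helper lookup string is an assumption based on the JADE-S documentation, and the prototype's own sendMessage()/retrievePrincipal() wrappers are not reproduced:

```java
import jade.core.AID;
import jade.core.Agent;
import jade.core.ServiceException;
import jade.core.security.SecurityHelper;
import jade.lang.acl.ACLMessage;

// Illustrative sketch: mark one message for signing and encryption.
public class SecureSendExample extends Agent {
    @Override
    protected void setup() {
        try {
            SecurityHelper security =
                    (SecurityHelper) getHelper("jade.core.security.Security"); // assumed service name
            ACLMessage msg = new ACLMessage(ACLMessage.REQUEST);
            msg.addReceiver(new AID("ExecutionAgent", AID.ISLOCALNAME));
            msg.setContent("get-principal");
            security.setUseSignature(msg);   // sign with this agent's private key
            security.setUseEncryption(msg);  // encrypt for the receiver
            send(msg);
        } catch (ServiceException e) {
            e.printStackTrace();
        }
    }
}
```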
In addition to the above, further modules have to be enabled to adjust the default security settings to the preferred ones. The SignAlgorithm module allows us to choose the algorithm used to sign messages, such as SHA1withRSA, MD5withRSA, DSA, etc. The size of the public and private keys can be defined by the AsymKeySize module, ranging from 512 bits as the default value up to 2048. A few more modules are available beyond the above ones to ensure additional customizability.
3.6. Dynamic Orchestrator Architecture
The service placement orchestrator was implemented as a multi-agent system (MAS) Java application using the JADE framework, which allows the development of MAS applications compliant with the FIPA specifications and offers features such as agent abstraction, asynchronous messaging, and service discovery based on the yellow pages method. The orchestrator is implemented as a distributed monitoring, decision-making, and execution model available in each fog node of the fog computing system. Continuous resource monitoring and request processing allow informed decisions to be made in a dynamic way. Service distribution among multiple nodes enhances computational output, power usage, and resilience. A distributed orchestrator is more resilient than a centralized one because of the absence of a single point of failure (SPF). Mobility, changing resource levels, and fog node failures make dynamic and distributed decision making preferable.
Once the fog nodes connect to the same network to form a shared infrastructure, they start monitoring their resources such as CPU, RAM, battery, and security. The Monitoring agent waits for resource-related messages using a cyclic behavior; these messages are classified by use case and tagged with the relevant resource levels. A Request agent waits for service request messages, which are also classified by use case. Thanks to the JADE-S security add-on, messages can be signed, encrypted, or both signed and encrypted. The required and actually used security measures are identified among the involved agents, and messages can be discarded if the requirements are not met. Please see Figure 7 for more architecture details.
The Decision Maker agent can receive messages from the Request agent, the Resource Monitoring agent, or the Synchronization agent. Messages from the Request agent can relate to a new service request or the redeployment of a current service. The Monitoring agent keeps its Decision Maker informed if resources fall below the required level. The Synchronization agent keeps resource data synchronized among the involved fog nodes so that the Decision Maker can make informed choices between a local and a remote service distribution. The Decision Maker uses the IMOPSO and AHP algorithms to calculate the best deployment based on the objective functions. IMOPSO is used as the primary request processing algorithm to handle the large number of options. Once these options are considered by the Decision Maker, a potential solution is further processed with reference to the objective functions. The AHP algorithm uses its criteria matrix to make the final placement decision based on the prioritized criteria. Four criteria are used at the moment, but this number can be adjusted as required.
As soon as a placement decision is made, it is communicated to the Execution agent, which runs a cyclic behavior waiting for its messages. Received messages are classified based on use cases or keywords such as “Decision” for internal purposes. For a local redeployment within the same JADE platform, agents can be moved from one container to another; agents cannot, however, be moved to another platform. When services have to be redeployed remotely, the current agents are killed and new ones with the same parameters are created in the remote platform. Any changes in the fog computing infrastructure, whether they involve a new service placement or a remote redistribution, are synchronized by a Synchronization agent. Additional details about the architecture are available in the publication [54].
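A minimal sketch of the local-redeployment case using JADE's intra-platform mobility API is given below; the container name is an illustrative assumption:

```java
import jade.core.Agent;
import jade.core.ContainerID;

// Illustrative service agent that moves itself to another container
// of the same JADE platform after a "Decision" message (not shown).
public class MovableServiceAgent extends Agent {
    public void relocateTo(String containerName) {
        ContainerID destination = new ContainerID();
        destination.setName(containerName); // e.g., "FogNode-1" (assumed name)
        doMove(destination);                // intra-platform move only
    }

    @Override
    protected void afterMove() {
        System.out.println(getLocalName() + " is now running in "
                + here().getName());
    }
}
```

Cross-platform redeployment, as noted above, is instead handled by killing the agent and recreating it with the same parameters on the remote platform.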
3.7. Service Placement Decision-Making Method
The method used for service placement in this research was proposed in [55], and it consists of two stages: the first uses IMOPSO and the second uses AHP. IMOPSO is a slightly adapted version of MOPSO, which was originally introduced by Coello et al. in [56]. AHP was first introduced by R. W. Saaty in [57]. The purpose of this two-stage method is to distribute $n$ services among $k$ fog nodes. All the QoS features of the $i$-th placement $X_i$ are defined by the objective functions $f_j(X_i)$, $j = 1, \ldots, m$. The optimization process seeks the best service placement $X^*$ while minimizing its objective functions $f_j$:

$$X^* = \arg\min_{X_i} \left( f_1(X_i), f_2(X_i), \ldots, f_m(X_i) \right)$$
As part of the fog computing paradigm, fog nodes can have different technical capabilities, network throughput, and security provisions. The qualities to consider may include CPU, RAM, power usage, network range, communication security protocols, authentication, etc. Optimization based on one feature may come at the expense of neglecting others; therefore, there is no universal solution to a multi-objective task. This leads to obtaining a set of non-dominated solutions subject to the fog node constraints. Please see Figure 8 for more details.
The IMOPSO method is used to obtain potential Pareto solutions. Since the placements added to the repository are non-dominated, each of them is better by some criteria scores. AHP is then used to pick the final best placement based on the prioritized criteria, comparing the alternatives with each other using a judgement matrix. More details about the implementation of the IMOPSO algorithm and the AHP process are provided in the publication [55]. A short summary is given below.
The IMOPSO algorithm is applied in the following way, as shown in Figure 9 (a dominance-check sketch follows the list):
A swarm is generated based on predefined test parameters such as the number of objective functions, particles, epochs, the inertia weight, the cognitive coefficient, the social coefficient, the number of services, and the range of particles. Particle positions within the swarm are randomly assigned, and initial values such as the global best score and position as well as velocities are set;
Each particle position is evaluated for its new individual score based on its objective function res = (x − y)²;
Velocity is updated using the formula newVelocity[i] = inertia × oldVelocity[i] + (pBest[i] − pos[i]) × cognitiveComponent × r1 + (gBest[i] − pos[i]) × socialComponent × r2;
Individual particle positions are updated by adding its velocity to its position;
Searching for individual best scores and individual best particle positions within the swarm;
Searching for the global best score and global best position;
If a particle dominates or if it neither dominates nor is dominated, it is added to a repository as an individual best result or global best result, respectively;
Repeat until the required number of cycles is completed.
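The repository update in the penultimate step hinges on Pareto dominance. The following is a minimal sketch of that check for minimized objective scores, independent of the prototype's actual data structures:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Illustrative Pareto-dominance helper for minimized objective scores.
public class ParetoRepository {
    private final List<double[]> repository = new ArrayList<>();

    // a dominates b if a is no worse in every objective and better in at least one.
    static boolean dominates(double[] a, double[] b) {
        boolean strictlyBetter = false;
        for (int j = 0; j < a.length; j++) {
            if (a[j] > b[j]) return false;
            if (a[j] < b[j]) strictlyBetter = true;
        }
        return strictlyBetter;
    }

    // Add a candidate if no stored solution dominates it; evict solutions it dominates.
    public void offer(double[] scores) {
        for (double[] stored : repository) {
            if (dominates(stored, scores)) return; // candidate is dominated
        }
        Iterator<double[]> it = repository.iterator();
        while (it.hasNext()) {
            if (dominates(scores, it.next())) it.remove();
        }
        repository.add(scores);
    }
}
```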
The AHP algorithm is applied in the following way, as shown in Figure 10 (a priority-vector sketch follows the list):
Repository particle positions in prepo are pairwise compared. This is completed with each criterion (objective function) fs[i];
Comparison results are stored in the array allComp[i];
The priority vector eigenvectorC is calculated as the normalized eigenvector of the matrix: each column of the priority matrix Ccomp is summed, and each element of the column is divided by that sum to obtain a normalized relative weight in the double array matrixnormC;
The transposed matrix eigenFinal is used to find the best particle position. The best service distribution is the alternative that has the highest level of priority.
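The column-normalization step above corresponds to the approximate (normalized-column-average) eigenvector method. A minimal sketch, with generic variable names rather than the prototype's Ccomp/matrixnormC/eigenFinal, could be:

```java
// Illustrative AHP priority vector via column normalization and row averaging.
public class AhpPriority {
    // judgement: n x n pairwise comparison matrix of the alternatives.
    static double[] priorityVector(double[][] judgement) {
        int n = judgement.length;
        double[] columnSums = new double[n];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                columnSums[j] += judgement[i][j];

        double[] priorities = new double[n];
        for (int i = 0; i < n; i++) {
            double rowSum = 0;
            for (int j = 0; j < n; j++)
                rowSum += judgement[i][j] / columnSums[j]; // normalized entry
            priorities[i] = rowSum / n; // average of the normalized row
        }
        return priorities; // the alternative with the highest value wins
    }
}
```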
5. Results
Several types of experiments were performed to test the performance of the method presented in the publications [54,55]. Choosing the right coefficients, such as the inertia weight, means finding an optimal balance between the lowest number of global optimum failures and the shortest possible execution duration. The criteria number, in its turn, defines the size of the matrix for pairwise comparison; a few criteria may initially be enough, but the number may grow as the system complexity increases. Stress tests are meant to take into account the availability of resources, which may vary due to background processes. The security package and its services trigger additional calculations and therefore may or may not have a significant impact on the response time. Finally, due to the mobile nature of fog devices and their limited energy resources, power consumption is considered and tested.
For the IMOPSO algorithm to function as optimally as possible, it is necessary to consider the inertia, cognitive, and social coefficients. The inertia weight was first introduced in [62]; the purpose of the inertia weight w is to balance between the local search and the global search. Please see Figure 13 for more details.
As a default setting, 4 simulated fog nodes, 12 services, and 200 epochs were used, with particle counts ranging from 50 to 500. The inertia weight ranged from 0.3 to 1.3. The cognitive and social coefficient of 1.499 was slightly adjusted to 1.496180, which, as per publication [63], leads to convergent behavior. As is visible from the test results, the inertia weight coefficient of 0.6 demonstrated the best outcome in finding the global optimum. It can be considered a threshold value and is used in the further experiments.
Different fog nodes might have different technical characteristics, since heterogeneity is inherent to fog computing; they might also have constrained resources and different QoS requirements. We use an AHP judgement matrix to evaluate such a situation and to prioritize the different criteria. All the experiments are mainly completed using a four-criteria matrix comprising power, CPU, security, and RAM. In our experiments, power is expressly prioritized over the other criteria; there is no particular reason for this, and any criterion can be adjusted as required.
The purpose of the experiment below is to test how much delay a certain number of criteria contributes to our optimization algorithm. As a default setting, 4 simulated fog nodes, 12 services, 50 particles, and 200 epochs were used. A PC and a Raspberry Pi 4 were used as physical fog nodes in order to compare a powerful fog node running the simulation test against a fog node with limited hardware resources. Please see Figure 14 for more details.
A range of 3 to 5 criteria is mostly expected; however, to test a broader scope, a range of 10 to 30 was included. With n criteria, the judgement matrix requires n(n − 1)/2 pairwise comparisons, so 3 to 5 criteria generate only 3 to 10 comparisons, which has little effect on the response time. However, 10 to 30 criteria generate 45 to 435 comparisons, and the response time rises significantly on a hardware-restricted fog node device. Still, a 1 s response time is manageable in real applications, even though 30 criteria are rarely needed.
The following experiment tests the effect of CPU availability on the response time. The goal is to consider CPU utilization in cases when additional services are to be hosted or background processes take place and overload the CPU. To learn the extent to which, and the threshold at which, a CPU can be overloaded by additional services, a CPU stress test was performed. Please see Figure 15 for more details.
A Raspberry Pi 4 was used as a physical fog node. The optimization algorithm performance evaluation was completed using 4 simulated fog nodes, 12 services, and 200 epochs as a default setting, with the number of particles ranging from 50 to 500. To perform the CPU stress test, a stress tool was installed, and the CPU was stressed with 0 to 6 workers, i.e., concurrent process threads launched to load the CPU. The algorithm itself loads the Pi 4 CPU at around 30% once started. Overall, 0 to 2 workers do not seem to have any negative effect; however, 4 to 6 workers, which load the CPU up to 100%, are critical. Services would have to be redistributed before such a level is reached.
The following experiment is meant to test the effect of RAM availability. The optimization algorithm performance evaluation was completed using 4 simulated fog nodes, 12 services, and 200 epochs as a default setting. The number of particles ranges from 50 to 500. Please see Figure 16 for more details.
Raspberry Pi 4 background processes initially occupy about 220 MB of RAM; once the algorithm is launched, the level goes up to 250 MB. This suggests that the algorithm is not as dependent on RAM as it is on CPU. However, significant background processes or additional services can still slow down the algorithm, in which case the fog node services need redistribution.
The following experiment tests a low battery level. An external analog-to-digital module would be needed for this, since the Raspberry Pi 4 does not natively offer ADC capability to detect a certain voltage level; however, a low or high input was used for test purposes. The class BattVolt.java was created to test the battery voltage level, and the Pi4J library was used by the BattVolt agent to read the input value. The polling interval by default was 5000 ms.
Using a Raspberry Pi 4 as a low-resource fog node, it takes up to 1.5 s to complete the whole service redistribution process; a PC as a powerful fog node requires about one-tenth of that time. More details are available in the paper [54]. CPU overloading does not seem to affect battery level sensing or service redistribution; the two-stage IMOPSO and AHP algorithm, however, is sensitive to overloading. Still, a few seconds to redistribute fog node services due to a low battery level might not be a problem in real-life conditions. Please see Figure 17 for graphical details and Table 2 for measurement details.
The following five experiments test the influence of security on the fog node response time using different security levels and two fog nodes with different hardware capabilities. JADE-S, a security add-on package, was used as the security option [64]. This add-on allows the development of multi-agent applications with a certain degree of security, including guaranteed message integrity, confidentiality, and authorization checks. It allows agents and containers in a platform to be owned by authenticated users, who are authorized by a platform administrator. Each agent has a public and private key pair, which is used to sign and encrypt messages.
The JADE security guide states that signing and encrypting messages can slow down agent communication performance, which is why it is not enabled by default; it is therefore important to check to what extent this happens. The optimization algorithm performance evaluation was completed using 2 physical fog nodes and 12 services as a default setting; services have to be moved from one fog node to another due to a low battery. The SecurityHelper class is used with the methods setUseSignature() and setUseEncryption() to set the signature and encryption for a message, and its methods getUseSignature() and getUseEncryption() are used to retrieve them. Default configuration values are used, i.e., the RSA asymmetric algorithm and a 512-bit key size for the public and private keys.
As Figure 18 suggests, four experiments with different security configurations were performed: (a) no security package, (b) signed, (c) encrypted, and (d) signed and encrypted. A PC was used as the high-resource fog node, and services had to be moved from one fog node to another due to a low battery level. Such a remote service redistribution did not demonstrate any significant response time variation across the different security approaches; there is a slight response time increase mainly due to an increase in battery level sensing time.
When a low-resource fog node such as the Raspberry Pi 4 is used, the response time increases by almost 10 times, as observed in Figure 19. It increases mainly through the impact of the placement finding algorithm, which is the most resource-intensive part, as witnessed in the publication [54]. The usage of the security package does not seem to have any significant impact, apparently because little agent communication is involved. The response time may increase slightly during redistribution and battery level sensing, as these processes involve communicating the sensed state to the Decision Maker and communicating the decision made by the Decision Maker to the Execution agents and finally the Synchronization agents.
In order to increase the security level, the security parameters can be adjusted instead of using the default ones; this is achieved by launching additional services. The following experiment tests the impact of the key size on the response time. The same scenario of remote service redistribution due to a low battery is used, with a Raspberry Pi 4 and a PC as two different fog nodes. Three key length configurations were used with each fog node: the service jade.security.AsymKeySize has to be launched, and the key length options are 512, 1024, and 2048.
As is visible in Figure 20, the size of the public and private keys did not have any significant impact on agent communication performance. There may be some slight increase in time during the procedures that involve more agent communication, such as redistribution and battery level sensing, but the biggest impact is made by the placement finding method.
Messages can be intercepted and read if they are exchanged as plain text; therefore, additional measures to encrypt them are beneficial. The following experiment tests the impact of the message encryption algorithm on the response time. The same scenario of remote service redistribution due to a low battery is used with a Raspberry Pi 4 and a PC as two different fog nodes. Four symmetric algorithm configurations were used for messages with each fog node: the service jade.security.SymAlgorithm has to be launched, and the chosen options were AES, Blowfish, DES, and TripleDES. More details are available in Figure 21.
As is visible from the chart, the message encryption algorithm did not have any significant impact. One potential reason is that the communicated messages themselves are very short, and short messages might not be resource demanding. Long messages would give a better idea, but these are simple agent communication commands, and no increase in their length is needed.
The following experiment tests the impact of the key pair algorithm on the response time. Two asymmetric algorithm configurations were used to generate key pairs with each fog node: the service jade.security.AsymAlgorithm has to be launched, and the options to choose from are RSA, which is the default, and DSA. More details are available in Figure 22.
RSA and DSA do not seem to have any significant advantage over each other, allowing us to conclude that the usage of the security add-on does not contribute to a higher response time, contrary to what the JADE security guide suggests. Moreover, the usage of security configurations other than the default ones does not noticeably contribute to a higher response time either.
It is important to maximize the overall runtime of the system. Mobile fog nodes are known for their scarce power supply, and resources therefore have to be used in an optimal way. From the power perspective, the best service distribution is one in which the nodes are evenly loaded, keeping the whole system available for as long as possible.
The following experiment measures the power in watts needed to obtain a service placement solution. For the method to be physically available, it has to be running in a fog node; a Raspberry Pi 4 was used as the host for the JADE platform. Measurements were made both with the Pi 4 simply running and with JADE launched and working on its tasks. Power needs were compared with a regular 40 W bulb for illustrative purposes. An official power supply adapter with an output of 5.1 V and 3.0 A was used, and the USB tester UNI-T UT658B was used to measure the voltage and amperage serving as the power input for the Pi 4. Please see Figure 23 for more details.
A remote service redistribution scenario was used. In total, 5.1 V and 0.5 A were needed to keep the Raspberry Pi 4 running, which equals 2.55 W of power. If JADE was launched with no optimization algorithms running, power consumption settled to the same 2.55 W after a few seconds. However, once an optimization process begins, the voltage stays more or less the same but the current increases to 0.62 A, which equals 3.162 W. More details are given in Table 3.
A very small amount of energy is needed, and the costs of maintaining such a fog node are minimal as well. Even if the optimization process runs without interruption for a whole month, which is hardly required in real-life conditions, the cost will be only around 0.1 EUR, assuming an electricity price of 0.214 EUR/kWh. Power demands, however, did not vary with the type of task running, such as battery level sensing, service placement finding, or redistribution; this suggests that the active time of the optimization should be considered as a whole instead of being broken into different processes.
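As a rough check of the 0.1 EUR figure, assuming a 30-day month and that the figure refers to the increment over the 2.55 W idle baseline:

$$\Delta P = 3.162\ \text{W} - 2.55\ \text{W} = 0.612\ \text{W},$$
$$E = 0.612\ \text{W} \times 720\ \text{h} \approx 0.44\ \text{kWh}, \qquad 0.44\ \text{kWh} \times 0.214\ \text{EUR/kWh} \approx 0.094\ \text{EUR}.$$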
The following experiment was conducted to analyze the energy consumption of the service placement solution as the task scales. The optimization model was run 20 times with each parameter set using a Raspberry Pi 4 as the fog node, and the fog computing system complexity was gradually increased by increasing the number of fog nodes, services, particles, and epochs. The average value of the 20 attempts was used. Please see Figure 24 and Table 4 for more details.
As is visible from Table 4, finding a service placement for 14 nodes and 42 services requires a very small amount of energy, around just 23 mWh. This is barely worth considering; what requires more attention is the time itself, as a response time of over 20 s is unacceptable in field conditions. The calculation response time can, however, be improved by using a more powerful fog node such as a PC. With the current complexity configuration and a low-resource fog node, it can still be beneficial to use it with up to 6 nodes and 18 services, giving a response time of a few seconds. A bigger problem to address in this case would be keeping a mobile fog node running on a battery when its power supply is limited.
6. Discussion
Different experiments were conducted with the prototype to determine the usability of the fog service placement orchestrator as a multi-agent system. The JADE platform was used to implement the fog computing services as agents in the Java language. A PC was used as a high-resource fog node, and a Pi 4 stood for a low-resource node. For increased platform security, the JADE-S security add-on was integrated. A stress package was installed on the Raspberry Pi 4 for the stress test, and measurements were taken for energy consumption. Additionally, the inertia weight coefficient and matrix size tests were completed to determine the best performance choices.
The inertia weight, according to the publication [63], demonstrates a clear, balanced relationship between exploration and exploitation, and premature convergence can be avoided by choosing the right inertia weight w [65]. The inertia weight of 0.6 showed the best results, since it leads, on average, to the lowest number of failures in finding the global optimum. Meanwhile, the number of criteria used for the judgement matrix highlights the importance of certain qualities such as security, CPU, RAM, and power. Four criteria were used for all the other experiments but, as is visible from the results of the current study, even 30 criteria with 435 comparisons have a barely visible impact on the performance of a high-resource fog node. On a low-resource fog node with a 30-criteria matrix, a service placement can be found in less than 1 s, which is still usable in field conditions.
Based on the CPU stress test, it is possible to conclude that at least 20% of the CPU must be available for the calculations; otherwise, the response time increases two or three times. This is not that relevant when the number of particles is low and the calculations are relatively simple, but it becomes critical with more complex ones. The CPU stress has the biggest impact on the service placement phase, in contrast to the other phases, which are relatively simple. When the response time increases from 1.1 s to 3.2 s, it may have a considerable negative effect on tasks that require a short response time.
The security add-on JADE-S did not add any clearly observed delay. Service placement decision making takes most of the time, but it does not require any agent intercommunication that is signed, encrypted, or both. Other processes are less complicated and less time-consuming. Some spontaneous time increases or decreases were noted but they were irregular and did not seem to play a significant role.
A USB tester and an energy meter were used to measure the voltage and current, and power or energy were measured or calculated as required. The self-cost of the JADE platform calculations for 1 month would be only around 0.1 EUR, and keeping the Raspberry Pi 4 hardware available would add another 0.4 EUR. However, mobility requires batteries, and even such modest energy needs can be an issue when the battery capacity is low.
The experimental results allow us to claim that this solution is applicable in a small fog-computing infrastructure. It offers dynamic and distributed decision making. A distributed decision-making process is more resilient than a central architecture due to the absence of a single point of failure. Any fog node can make decisions to launch, place, and eliminate relevant services. These decisions are synchronized afterwards. If there are any failures, services can be relocated. It is also possible to sign and to encrypt agent messages for additional security. Constant resource monitoring allows us to keep track of the available resources to make informed decisions.