An Intelligent Approach to Allocating Resources within an Agent-Based Cloud Computing Platform

Abstract: The cloud computing paradigm has the ability to adapt to new technologies and provide consistent cloud services. These features have led to the widespread use of the paradigm, making it necessary for the underlying computer infrastructure to cope with the increased demand and the high number of end users. Platforms often use classical mathematical models for this purpose, helping assign computational resources to the services provided to the final user. Although this kind of model is valid and widespread, it can be refined through intelligent techniques. Therefore, this research presents a novel system consisting of a multi-agent system, which integrates a case-based reasoning system. The resulting system dynamically allocates resources within a cloud computing platform. This approach, which is distributed and scalable, can learn from previous experiences and produce better results in each resource allocation. A model of the system has been implemented and tested on a real cloud platform with successful results.


Introduction
Cloud computing (CC) has undergone rapid growth since users began to notice its great advantage over traditional IT systems. This is because this paradigm facilitates the development of distributed computing systems, data management and computing resources through a scalable network, data processing centers and web services [1]. Hence, this technology is leading the revolution of distributed computing and has resulted in the rapid growth of private and public platforms [2][3][4][5]. There is no doubt that the general social acceptance of this paradigm [6] has largely contributed to its development, due to the economic interests of large technology companies that underline the purely technical aspects [7,8].
The marketing model used in the CC paradigm is also innovative, as it is based on a pay-as-you-go concept [7] for utility services such as water, gas, etc. In the utility computing paradigm [9], users are required to first negotiate and establish a service level agreement (SLA) to have access to the goods [10]. Having established a contract for the computing-based goods, both users (who pay fees on a regular basis) and the CC system (through service maintenance) are obliged to comply with the agreement. This marketing model contractually requires CC platforms to maintain the quality of services (QoS) as through the implementation of a case-based reasoning (CBR) system [23], a model which, to the best of our knowledge, has not been applied before in distributed systems with these characteristics. These reasoning systems use past experience to solve new problems; their intelligent model resembles human intelligence.
Consequently, we have developed a computational resource distribution model to be used in distributed environments, capable of managing resources according to past experiences, and of dynamically adjusting the resources allocated to each service. Using such a model makes it possible to achieve appropriate responses to demand and to increase the degree of effectiveness/efficiency of the solution and the state of the CC. Therefore, managing the functions of the nucleus of a CC system through an agent-based model and a CBR approach makes it possible to create a much more efficient, scalable and adaptable platform than the current ones.
This paper is structured as follows: the next section describes the context of the current work and the state-of-the-art approaches related to the presented one. Section 3 focuses on the MAS-based platform that supports the technical proposal. Section 4 presents the resource allocation algorithms applied in each agent, while Section 5 presents the evaluation and validation of the proposal. Finally, the last section draws conclusions from the conducted research.

Related Technologies and State-of-the-Art Approaches
Usually, in a CC environment, the term hardware infrastructure refers to the virtualized infrastructure [24,25], which means that there is a layer of abstraction between the actual hardware infrastructure and the computation nodes. In this way, each of the offered services is deployed in virtual nodes belonging to that layer of abstraction, and these are called virtual machines (VMs). Therefore, services are usually distributed and this distribution of services in the different nodes requires a system that provides the necessary load balance to distribute the requests among the different computer nodes that attend them.
Each VM has a set of dedicated hardware resources and is completely independent, making it possible to run different operating systems whose software is also completely independent. On any given physical machine, the virtualization process allows one to share and encapsulate the physical resources among a set of VMs by following a pattern similar to the master-slave model [26], in which the (real) server hosts different VMs. This capability, made possible by virtualization technology, allows the computational resources in the virtualization layer to be considered unlimited, while the physical resources or hardware infrastructure are limited to the real infrastructure. The only disadvantage of this technology is that the management of the virtual environment requires highly specialized software called a hypervisor [27]. This component is responsible for managing the hardware abstraction in each physical node, and it also consumes resources [28]. Nevertheless, this computational consumption depends on the selected virtualization model. Currently, the additional cost does not pose any significant problem as it does not exceed 2% of the computational power of the physical environment [24,29]. However, it is necessary to have dedicated hardware, which is also widely available as a result of technologies such as Intel VT or AMD-V [25].
Virtualization techniques greatly simplify both the management and control of IT resources at the infrastructure level, supporting the dynamic creation or removal of VMs on demand or even enabling the migration of a VM from one physical machine to another at runtime, without stopping or pausing the machine or service. Therefore, thanks to virtualization technology, this complex problem is actually simple to solve, since it is only based on the efficient redistribution of physical (real) resources among the different computational (virtual) nodes. Due to its complexity and its contemporaneity, this problem has attracted the attention of the scientific community, which has led to the proposal of many different solutions.

Current Approaches to Allocated Computational Resources
There are two main approaches to resource distribution in the state-of-the-art systems [8]. The first one involves the search for the cheapest and most efficient provider in terms of resource usage. Thus, this approach follows a model in which a broker or resource manager, usually an external one, is in permanent and simultaneous contact with various providers, and selects the most appropriate one for each client at any given moment. The key technologies employed in this approach are service discovery, negotiation, etc. However, this approach is not the focus of this research and thus will not be considered any further.
The second approach is within the scope of this study, as it is concerned with the efficiency of resources within a CC provider's data centers. In this case, the problem can be contemplated from two different perspectives [30,31]. On the one hand, a large-scale CC resource provider with different distributed data processing centers throughout the world must process client requests according to two variables: the distance from the data center, and the current workload at each center. This makes it easier to reduce the latency of the response given to clients. This model is known as a time-drive adaptive mechanism [31]. On the other hand, a small-scale resource provider is one that distributes resources from an individual data center, regardless of its size. The task of distributing resources is performed by a component usually referred to as the resource allocation system (RAS). Regarding this perspective, the state of the art presents two major approaches [8]:
• QoS-aware, or market-oriented, approach [8]. This approach is related to a customer-oriented resource and service distribution model, seeking to minimize computing risks in order to distribute computing resources, following the SLA and a pay-per-use model. In this model, computer resource management techniques aim to comply with these agreements at all times, thus providing the quality of service requested by the end user. In line with this approach, the state of the art includes proposals such as RAS-M [15], which is a model based on the market economy. Its aim is to redistribute resources in a way that leads to a fair market price. Other studies address the distribution of resources on a mathematical basis, such as game theory [32,33], or the maximization of an optimization function, based on linear programming techniques, given that it simply involves the optimization of resources [34][35][36].
• Energy-aware approach [8]. In this second case, the distribution of resources considers both energy consumption and the pre-established SLA, which implies compliance with both. This approach has less published research because it is more recent. It includes a variety of techniques such as the application of energy-saving policies in physical machines and VMs [16,37], the efficient redistribution of VMs according to the consumption of each physical machine [38], or models that are based on optimization techniques [17,39]. Each of these is at an incipient stage of development, which makes their application in large computing centers more complicated.
In the light of the examination of the most recent studies, it is essential to build a model for the distribution of computer resources that will consider energy consumption as a key variable. The objective of including this variable is to seek to reduce energy consumption to the point that it satisfies the SLAs that have been established with the users of the environment. Moreover, all algorithms presented in this section follow centralized approaches, which use mathematical and economic theory models to provide the best suited algorithms depending on the type of problem. This research follows a novel approach, different from the usual ones, as it is based on AI and optimization techniques, which allows one to distribute resources following a distributed and scalable model, thus allowing the system to learn during a long period of time. To this end, the system is based on a MAS architecture, which makes it possible to interact among the different platform components and permits the introduction of advanced reasoning algorithms to facilitate the development of autonomous and distributed models able to dynamically self-adapt to changes in the environment.

Proposed Intelligent Model
On the basis of the analysis described above, where the strengths and weaknesses of the related work have been analyzed, we intend to propose a completely different model, based on CBR and with a MAS architecture to allocate resources and manage CC. The MAS architecture is called +Cloud (multi-agent system cloud) and it is designed specifically for the monitoring and control of a CC infrastructure. This MAS has been designed to leverage some of the most important characteristics of this type of system [39,40] (autonomy, pro-activity, intelligence, learning, organization, mobility, etc.) with the aim of implementing a distributed control strategy that allows for the design of decentralized algorithms for the management and control of cloud infrastructures.
+Cloud has been described at length in previous works [40,41], and it has been validated as an adequate system for the implementation of decentralized algorithms that allocate computational resources [42]. Thus, this article focuses only on the contributions we add to this architecture. Specifically, a decentralized algorithm has been developed that incorporates a case-based reasoning model. This model gives priority to the distribution of responsibilities and to the limited knowledge of the system held by each of the MAS components. Thus, this section describes the key components that make it possible to extend the operation of the elastic algorithms to resource distribution.
However, it should be noted that since the proposed MAS is a distributed system by its own definition, all the agents involved in resource distribution tasks may be located making efficient use of the CC environment. In other words, a distributed approach is used in the monitoring and control of the CC system. The cloud environment obtains the data from all of the CCs, both from the services it provides and from the infrastructure itself, thus having a distributed model oriented towards monitoring, which allows existing resources to be instantly adapted to the characteristics sought for in the CC environment. In this way, the demands of each service are satisfied in an agile way, meeting the objective of reducing energy consumption and SLA agreements. Figure 1 presents the primary agents in charge of the distribution of computational resources. The main difference between a CC environment and previous technologies is that it has the ability to offer the demanded services through a pay-per-use model [7]. +Cloud has been designed using the GORMAS methodology [43]. The model followed in the design of this MAS differs from the traditional control models employed in the development of this type of platforms, where decisions are usually taken centrally [8]. In this regard, monitoring and decision-making related responsibilities have been distributed throughout all the components of the platform (servers, services, etc.). Thanks to this model, it is possible to decide where the information is collected on the basis of local knowledge, which has allowed for the design of agile control processes based on uncertainty and interaction between peers.
In general terms, when a service k is demanded by a process or user, the service must fulfill it with the agreed SLA. This issue is controlled by the two types of agents associated with each service (service monitor and service supervisor) and with the sub-organization of resource consumption. The service monitor agent (SMA) is in charge of monitoring each of the services offered by the system, collecting data regarding the requests being made and measuring parameters of their quality, performance, errors, etc., in addition to having access to the demand history. There is an agent for each of the services offered by the CC and it is located in the node that balances the requests. The service supervisor agent (SSA) ensures that the previously established SLAs are being complied with, taking appropriate action when deficiencies are detected. The SSA is also responsible for ensuring the high availability of the service, making sure that at least a certain number of nodes are working on independent physical machines.
The services are deployed on servers, which represent the system's computer resources. Three roles (local monitor, local supervisor and global manager) associated with the sub-organization that provides resources are also deployed here. The local monitor agent (LMA) is in charge of collecting data regarding the state of the local resources of each physical server ($PR_i$), including its virtual machines, and transforming these data into adequate information for decision making. The local supervisor agent (LSA) is in charge of the control and distribution of the physical machine's computer resources, and is able to redistribute resources among the execution instances and to launch or shut down virtual machines. Its objective is, therefore, the efficient management of the individual resources of the physical server among the different nodes it hosts, maximizing their use without diminishing the quality of the services being provided, and ensuring that the physical machine always has the minimum resources to perform the tasks of coordination and control. Its work is carried out in close collaboration with the server's local monitor agent. The LSA and the LMA have total knowledge of their individual server, but at the same time uncertainty regarding the rest of the infrastructure. The global manager (GMA) is the role in charge of making decisions about how to distribute the computer resources among several nodes of the CC platform, and not only at a local level as in the case of the LSA's role.
In order to provide assistance to the decision-making process regarding the means of distributing resources, they use a partial knowledge base (provided by the LMA role) and past experiences stored in the usage history repository, which is achieved by relying on the CBR model [23]. The GMA's role is to determine how and with what characteristics the new service-associated virtual machines will be instantiated. Additionally, it is also responsible for the process of compacting the existing virtual machines into a smaller number of servers, so that it is possible to shut down physical servers and reduce energy consumption. Given that it is possible to stop and start execution nodes, it is also possible that these agents enter and leave the system dynamically, linking their life cycle to that of the server they monitor or control.
All these agents have a local knowledge base, but they also store historical information on a centralized NoSQL server, based on MongoDB. When the agents enter the system, they retrieve specific information about the history of the specific service or server where they are located. Concretely, there are three centralized repositories (resources, SLAs and historical), as can be seen in the structural view according to the GORMAS methodology given in Figure 2. This figure also shows the detail of the activity model according to GORMAS for the infrastructure control service.
The next section provides a detailed description of the reasoning model designed to allocate computational resources by means of agents within a distributed and heterogeneous environment where the level of uncertainty is high.

Intelligent Model for Allocating Resources
The redistribution of resources is performed by the GMAs located in each physical server, which have greater authority than the LSAs and can request them to add or remove virtual nodes for a specific service k with specific characteristics. The GMA is a highly specialized agent in charge of providing the CBR architecture [23,44]. This reasoning process is performed simultaneously in all physical machines with available resources. In the last part of the distributed algorithm, a new VM with specific characteristics of virtual CPUs (vcpu) and memory (M) is instantiated to meet current demand (see Figure 3). The CBR reasoning at each physical node is based on the experience gained from stored similar cases. The knowledge base (KB), or case memory, is shared across the whole CC environment, so the global knowledge of the system is available to each of the GMAs. Since this memory can grow considerably, a high-speed, schema-free database based on MongoDB is used as a maintenance strategy to provide rapid access to the stored data.
Appl. Sci. 2020, 10, x FOR PEER REVIEW 7 of 18
The reallocation process is initiated by the SSA associated with the service that is experiencing difficulties in responding to demand, as shown in Figure 4. This figure shows a hypothetical case of communication among the agents presented in Figure 3, where service k is deployed on virtual nodes hosted by three servers ($PR_1$, $PR_2$ and $PR_i$), which are represented as vertical lines. The servers host their agents and processes as described. The communication between agents and other processes is indicated by arrows. The aim of the communication is to find a solution, $S(P)$, to service demand.
When a service k is experiencing difficulties, the associated SSA alerts all physical machines (that host the nodes of the service) that a new allocation of resources is required to handle the demand, as shown in the messages $m_2$ sent from the SSA to the remaining agents of the service in Figure 4. The message is received by the GMA of each physical server of the service (the GMAs of $PR_1$, $PR_2$ and $PR_i$ in Figure 4). The GMAs that received the message alert the remaining GMAs in the CC environment, as shown in the message $m_2$ sent from the GMA of $PR_1$ to $PR_2$ in Figure 4.
These agents ask the LMA of each machine to evaluate the level of usage of the internal resources. Each LMA generates a matrix ($I^{t}_{PR_i}$) with the information for the time t in order to evaluate the amount of available resources. Communication occurs between the GMA and LMA of each $PR_i$, which implies the message sequence $\{rq_{1,1}, rq_{1,2}, rp_{1,3}, rp_{1,4}\}$ for the case of $PR_1$ in Figure 4. This snapshot structure gathers instantaneous information ($I^{t}_{PR_i}$) on the complete state of resource assignment at a specific time t in the resource $PR_i$ (see Table 1). With this information, the GMA determines whether the physical machine can enter the decision process. If the amount of resources in each physical machine is greater than the minimum indispensable resources required to instantiate a node associated with the service, then the reasoning process is initiated to determine the amount of resources that can be reserved for each execution node as requested by the SSA for that service. Figure 4 shows the messages $rp_{1,5}$ and $rq_{2,5}$ sent by the CBRs in the GMAs of $PR_1$ and $PR_2$, respectively, to the SSA. However, if there are no available resources, or if there are fewer resources than those demanded by the service, the GMA determines that the physical node is not part of the global assignment process (see message $rq_{3,5}$ of unavailable resources sent from the GMA in $PR_3$ to the SSA, Figure 4). If there is no physical machine that could respond to the increased service needs, a new physical server is requested to start up in order to instantiate a VM according to the minimum characteristics defined in the service level.
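The admission decision described above can be sketched as follows. This is a minimal illustration, not the platform's implementation: the `Resources` type, the function names and the string return codes are hypothetical, and a server is assumed to qualify when its free memory and vcpu both cover the service minimum (equality is treated as sufficient).

```python
from dataclasses import dataclass


@dataclass
class Resources:
    """Free or required resources (hypothetical, simplified snapshot)."""
    memory_mb: int
    vcpu: int


def can_join_decision(free: Resources, minimum: Resources) -> bool:
    """True when a server's free resources cover the minimum VM of the service."""
    return free.memory_mb >= minimum.memory_mb and free.vcpu >= minimum.vcpu


def eligible_servers(snapshot: dict, minimum: Resources) -> list:
    """Names of the servers whose GMA may enter the reasoning process."""
    return [name for name, free in snapshot.items()
            if can_join_decision(free, minimum)]


def handle_request(snapshot: dict, minimum: Resources) -> tuple:
    """Either start the reasoning on eligible servers or ask for a new server."""
    eligible = eligible_servers(snapshot, minimum)
    if not eligible:
        # No machine can host even the minimum VM: request a new physical server.
        return ("start_new_server", [])
    return ("reason", eligible)
```

For instance, with three servers of which only two have enough free memory, `handle_request` would return `("reason", [...])` listing those two, mirroring the unavailable-resources reply sent by the third server.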
When the machine has enough computational resources (cases $PR_1$ and $PR_2$), the GMA initiates the process based on the following definition of a case: $C = \{P, S(P), E\}$, where:
• $P$ represents the description of the problem, with a matrix-matched representation directly related to the instantiation of resource use, $I^{t}_{PR_i}$, along with the description of the service level that is instantiated and the temporal indicator (timestamp) that identifies the instant at which the problem was detected. Here, $VM^{k}_{t}$ is the description of the minimum resources, in terms of memory $M$ and vcpu, that are needed by the service.
• $S(P)$ represents the solution to $P$ in terms of memory and vcpu: $S(P) = (M, vcpu)$.
• Finally, $E$ represents the efficiency, which is measured from two perspectives: micro and macro.
Firstly, the efficiency at the micro level ($E_m$) is associated with the level of efficiency of the solution proposed within the physical server where the VM has been deployed. The LMA proposes this level of efficiency according to the processor usage rates and the allocated memory.
where $M_{used}$ and $M_{assigned}$ are the used and total allocated memory, respectively. The efficiency at the macro level ($E_M$) is associated with the degree of efficiency from the point of view of the service and is calculated if the proposed solution requires the process of infrastructural resource distribution to be initiated at a macro level. In this sense, the degree of efficiency measures the number of additional nodes n required by the service.
Therefore, the efficiency is given by the following expression: $E = (E_m, E_M)$.
The CBR (case-based reasoning) cycle starts by retrieving similar cases from the case memory. The most similar cases are selected as explained below, and the formal algorithm is given in Algorithm 1:

1. Select the cases whose physical machines have similar characteristics and whose degree of efficiency is greater than 90%. The similar physical machines are evaluated according to the benchmark parameter, which characterizes each physical machine.
2. On the basis of this subset of retrieved cases, a vector $C_i$ is configured for each case that contains the number of VMs in the case and the free resources available: $C_i = (n, M, vcpu)$.
3. The cases selected from this subset are those that had previously used the same service, and for a similar period of time. This is determined by analyzing a network usage pattern over a period of one week.
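The three retrieval steps can be sketched as a chain of filters. This is only an illustrative sketch under stated assumptions: the `Case` fields, the benchmark gap threshold, the encoding of the weekly usage pattern as a list of numbers and the mean absolute difference used as pattern distance are all hypothetical choices, not details given in the text.

```python
from dataclasses import dataclass


@dataclass
class Case:
    """Simplified stored case (field names are assumptions for illustration)."""
    benchmark: float        # characterizes the physical machine
    n_vms: int              # number of VMs in the case
    free_memory: int        # free memory M
    free_vcpu: int          # free vcpu
    service: str            # service that raised the problem
    usage_pattern: list     # weekly network usage pattern (assumed encoding)
    solution: tuple         # S(P) = (M, vcpu)
    efficiency: float       # efficiency in percent


def pattern_distance(a: list, b: list) -> float:
    """Mean absolute difference between two equally long usage patterns."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def retrieve(query: Case, kb: list, max_benchmark_gap: float,
             max_pattern_gap: float) -> list:
    """Return (case, C_i) pairs that survive the three retrieval steps."""
    # Step 1: similar physical machine (benchmark) and efficiency above 90%.
    step1 = [c for c in kb
             if abs(c.benchmark - query.benchmark) <= max_benchmark_gap
             and c.efficiency > 90.0]
    # Step 2: configure the vector C_i = (n, M, vcpu) for each retrieved case.
    step2 = [(c, (c.n_vms, c.free_memory, c.free_vcpu)) for c in step1]
    # Step 3: same service and a similar weekly usage pattern.
    return [(c, v) for c, v in step2
            if c.service == query.service
            and pattern_distance(c.usage_pattern, query.usage_pattern) <= max_pattern_gap]
```

Keeping the steps as separate list comprehensions makes it easy to inspect how many cases each criterion discards, which may help when tuning the thresholds.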
During the reuse phase, a solution to the problem is prepared, based on the cases that have been retrieved:

• If the case base does not contain a similar past case, the solution to the problem is the minimum resources determined at the level where the service is instantiated: $S(P) = (M_{min}, vcpu_{min})$.
• If similar cases are retrieved, the solution to the problem is the solution of the case closest to the new one, multiplied by the efficiency of that most similar case.
• If few resources are available and the values assigned by the previous solution are higher than those the machine can provide, the solution is the maximum amount of resources available in the machine.
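The three reuse rules above can be illustrated with a short sketch. It assumes that efficiency is expressed as a fraction in [0, 1], that memory is truncated to an integer and that vcpu counts are rounded up; these numeric conventions, like the function and parameter names, are assumptions made for the example and are not specified in the text.

```python
import math


def reuse(closest, minimum, available):
    """Propose S(P) = (M, vcpu) from the closest retrieved case.

    closest   -- ((M, vcpu), efficiency) of the most similar case, or None
    minimum   -- (M_min, vcpu_min) defined by the service level
    available -- (M, vcpu) currently free in the physical machine
    """
    if closest is None:
        # Rule 1: no similar past case, fall back to the service minimum.
        return minimum
    (memory, vcpu), efficiency = closest
    # Rule 2: scale the closest solution by its efficiency (assumed in [0, 1]).
    proposed = (int(memory * efficiency), math.ceil(vcpu * efficiency))
    # Rule 3: never propose more than the machine can actually provide.
    return (min(proposed[0], available[0]), min(proposed[1], available[1]))
```

As a usage example, `reuse(None, (1024, 1), (8192, 8))` falls back to the service minimum, whereas a retrieved solution of `(4096, 4)` on a machine with only `(2048, 2)` free is capped at `(2048, 2)`.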
When the solution to the case is calculated, it is sent to the SSA. This agent reactively chooses the node that would provide the highest amount of resources at the VM level. In the subsequent review stage, the new node is deployed, and its use is evaluated from both a micro and macro perspective. In this way, the efficiency of the solution is obtained. Finally, in the last step of the CBR cycle, both the case and the value of the efficiency are stored so that they can be reused in future executions. The proposed adaptation model is distributed, which improves the availability of the system, since the decision-making process is carried out throughout the entire CC system. Furthermore, this model distributes the computational effort required to obtain the solution; as a result, the impact that the search for a solution has on the CC environment is reduced.

Algorithm 1.
Steps performed by the case-based reasoning algorithm involved in each global manager to allocate resources demanded by a service.

CBR Algorithm
INPUT: P, the description of the problem (case) to solve for a service k.
OUTPUT: S(P), the solution to problem P.
REQUIRE: KB: the knowledge base; d: dissimilarity metric of cases; E: computes the solution efficiency of a case; #vm: number of VMs for a case; ar: the amount of available resources; ur: the amount of used resources; Service: the service used; T: service use period; ρ: dissimilarity metric of the service use period.
% Retrieve all cases similar to the case with feature P in a
% set SC_1(P)
SC_1(P) := {P_i ∈ KB / d(P, P_i) ≤ mp AND E(P_i) > 90} % where mp is a
% threshold (i.e., midpoint, mean, etc.) stating similarity
% between cases.
% Select all cases in SC_1 with the same number of virtual
% machines and with available resources.
Assign a vector v_i := (n_i, m_i, vcpu_i) to each P_i of the set:
SC_2(P) := {P_i ∈ SC_1(P) / #vm(P) = #vm(P_i) AND ar ≥ ur(P)}
% Select cases that have used the current service k and for a
% period similar to case P.
SC_3(P) := {P_i ∈ SC_2(P) / Service(P_i) = k AND ρ(T(P), T(P_i)) ≤ t} % where t
% is the threshold stated for period similarity.
If
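The three successive filters of the retrieval stage (SC_1, SC_2 and SC_3) can be sketched in Python as follows. This is only an illustration: the `Case` structure, the scalar treatment of available and used resources, and the way the metrics d and ρ are passed in as functions are assumptions, not details given in the listing.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Case:
    """Hypothetical case record; field names are illustrative."""
    features: Tuple[float, ...]  # descriptor compared by the metric d
    n_vms: int                   # #vm
    service: str                 # Service
    period: float                # T, the service use period
    efficiency: float            # E, assumed to lie on a 0-100 scale


def retrieve(p: Case, kb: List[Case],
             d: Callable[[tuple, tuple], float],
             rho: Callable[[float, float], float],
             mp: float, t: float,
             available: float, used_by_p: float) -> List[Case]:
    """Apply the SC_1, SC_2 and SC_3 filters in sequence."""
    # SC_1: similar cases whose stored efficiency exceeds 90.
    sc1 = [c for c in kb
           if d(p.features, c.features) <= mp and c.efficiency > 90]
    # SC_2: same number of VMs, and enough available resources (ar >= ur(P)).
    sc2 = [c for c in sc1
           if c.n_vms == p.n_vms and available >= used_by_p]
    # SC_3: same service and a similar use period.
    return [c for c in sc2
            if c.service == p.service and rho(p.period, c.period) <= t]
```

Keeping the filters separate mirrors the structure of the listing and makes it easy to inspect how many cases survive each stage.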

Compaction
The proposed model uses the key characteristics of virtualization technology, which involves migration of machines between physical servers. In fact, this characteristic is very effective when compacting VMs into the smallest possible number of servers. This allows one to turn off/switch to sleep mode the set of physical servers that have not yet been assigned a machine, which significantly increases energy efficiency.
However, problems occur when the machines are widely dispersed, which can happen when an SSA associated with a specific service detects several nodes that are at the highest priority level (they have a high quality of results) and requests the random elimination of one of these nodes.
The compaction process is simple and is applied by the specialized GMA as follows:

1. A server hosting a very low number of VMs (the threshold is set by the system administrator) asks other servers to host its VMs so that it can go into sleep mode and stop consuming resources.

2. The nodes with available resources at the time of the request evaluate the snapshot (I^t_{PR_i}) provided by the LMA to determine whether they can host another machine with the given characteristics.

3. If there are resources available, the configuration is sent to the GMA that made the request, and the migration process is initiated. If this agent receives several confirmations simultaneously, it initiates the migration process with one of the machines, chosen at random. The choice is random because the agent does not know all the internal details of the machine that will host the new virtual node.

This simple process makes it possible to compact the set of VMs without affecting the quality of service, since the individual resources of each machine are not modified. The system then goes into a compact state.
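The compaction steps can be sketched in Python as follows. The classes `VM` and `Server` and the function `compact` are hypothetical, and the sketch replaces the agent message exchange with direct method calls; only the decision logic (threshold check, capacity check against the snapshot, random choice among confirming servers) follows the description.

```python
import random
from dataclasses import dataclass, field
from typing import List


@dataclass
class VM:
    name: str
    memory_mb: int
    vcpus: int


@dataclass
class Server:
    name: str
    free_memory_mb: int
    free_vcpus: int
    vms: List[VM] = field(default_factory=list)

    def can_host(self, vm: VM) -> bool:
        # Step 2: check the current snapshot of free resources.
        return (vm.memory_mb <= self.free_memory_mb
                and vm.vcpus <= self.free_vcpus)


def compact(source: Server, others: List[Server],
            max_vms: int, rng: random.Random) -> bool:
    """Try to empty `source`; return True if it can go to sleep."""
    # Step 1: only lightly loaded servers ask to be emptied.
    if len(source.vms) > max_vms:
        return False
    for vm in list(source.vms):
        confirmations = [s for s in others if s.can_host(vm)]
        if not confirmations:
            continue  # nobody can host this VM; it stays where it is
        # Step 3: pick one confirming server at random and migrate.
        target = rng.choice(confirmations)
        source.vms.remove(vm)
        source.free_memory_mb += vm.memory_mb
        source.free_vcpus += vm.vcpus
        target.vms.append(vm)
        target.free_memory_mb -= vm.memory_mb
        target.free_vcpus -= vm.vcpus
    return not source.vms
```

Because the resources assigned to each VM are left unchanged during migration, the sketch preserves the property stated above: compaction does not alter the resources of any individual machine.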

Evaluation of the Proposed Model
In order to evaluate and validate the model proposed in this article, a CC platform designed and developed by the BISITE research group (http://bisite.usal.es.) has been used. This CC platform was deployed in the HPC (High Performance Computing) environment, which offers numerous services and is composed of 15 machines that support hardware virtualization through Intel-VT technology and the KVM (Kernel-based Virtual Machine) virtualization system. The MAS is implemented in the Python programming language because of its power, ease of maintenance and flexibility in handling data structures. The data exchange format is based on JSON (JavaScript Object Notation). The web service layer has been implemented using the Tornado web development framework because of its ability to manage a large number of client connections.
With regard to the distribution of the initial resources (see Figure 5), a cloud service (the file storage service) is deployed in two virtual nodes (VM_1 and VM_2), each one hosted by a different physical machine (PR_1 and PR_2, respectively). As a result of this deployment, the service is highly available (deployed in two servers), and it runs on two physical machines with different computational loads, something that happens in real environments, since the two physical machines host other virtual machines that correspond to other services of the CC platform. In other words, the physical server PR_1 has many available and unallocated resources, while PR_2 has no available resources and the machines it hosts have a high computational load.
Between 10 and 40 threads were launched, each of which called specific methods of the service (GetSize and GetFolderContent) every 3 s. The acceptable QoS level for the GetSize function in this experiment is set at 1.5 s, while the QoS threshold for GetFolderContent is set at 0.5 s.
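As an illustration of how these two thresholds could be checked, the following Python sketch flags the methods whose response time exceeds the acceptable QoS level. The function name `violated_methods` and the use of the mean response time as the monitored statistic are assumptions; the source does not state which statistic the Service Supervisor monitors.

```python
from statistics import mean
from typing import Dict, List

# QoS thresholds from the experiment, in seconds.
QOS_THRESHOLDS_S: Dict[str, float] = {
    "GetSize": 1.5,
    "GetFolderContent": 0.5,
}


def violated_methods(samples: Dict[str, List[float]]) -> List[str]:
    """Return the methods whose mean response time exceeds its threshold.

    `samples` maps a method name to its measured response times in seconds.
    Using the mean is an assumption made for this sketch.
    """
    return sorted(
        method
        for method, times in samples.items()
        if times
        and method in QOS_THRESHOLDS_S
        and mean(times) > QOS_THRESHOLDS_S[method]
    )
```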
The process starts once the Service Supervisor detects a decrease in performance; at that point, it directly executes the adaptation process. This type of adaptation occurs when the demand for the service grows considerably; an increase in the computational load then leads to an even sharper increase in the load. The process for exchanging messages among agents during the adaptation of the infrastructure is shown in Figure 6. The specialized Service Supervisor agent that initiates the service also sends an alert (step 1, Figure 6) to the Global Manager agent of each of the physical machines that host the service nodes. It should be noted that the GMA is a specialized agent that uses a CBR reasoning process [23,44] in order to allocate resources at the macro level. After receiving the first alert, the agents forward the alert message to the rest of the GMAs in the CC (step 2, Figure 6).
Appl. Sci. 2020, 10, x FOR PEER REVIEW 12 of 18 Figure 5. Evaluation of the case study-status during the initial period.
The next process is carried out in parallel in each physical node of the CC system. Each GMA that has received an alert message requests the LMA of the machine in which they are located to instantiate the state of the system (I^t_{FR}). The agent uses this information to determine if there are available resources to instantiate a new node associated with the service requesting resources. If the machine, in response to the instantiation, detects that the physical server does not have any available resources, it does not perform any action.
This process is conducted in parallel for each of the machines that have available resources.
This problem description is used to retrieve similar cases from the case memory. Each GMA individually determines which resources it can relinquish to the new node that is to be instantiated. This new provision of resources is itself the solution to the current problem, S(P). The solution provided by each of the CBR agents from each physical server with available resources is then sent to the Service Supervisor agent that initiated the process because of its own performance difficulties (step 3, Figure 6). Once it has received the set of proposed solutions from the different CBR-BDI agents, the agent overseeing the service sends an acceptance message to the GMA that offers the greatest amount of resources for the new node that must be instantiated (step 4, Figure 6). Finally, the GMA that receives the acceptance asks the LMA of its machine (step 5, Figure 6) to instantiate a new virtual node at the level of the VM associated with the service, according to the proposed problem solution.
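Steps 3 and 4 of this exchange can be sketched in Python using JSON messages, the data exchange format used by the platform. The message fields (`gma`, `vcpus`, `memory_mb`, `type`) and the function `select_offer` are hypothetical, and ranking offers first by virtual CPUs and then by memory is an assumption, since the source does not define how "the greatest amount of resources" is compared across resource types.

```python
import json
from typing import List, Optional


def select_offer(offers_json: List[str]) -> Optional[str]:
    """Pick the GMA offering the most resources and build an acceptance.

    Each offer is a JSON string such as
    '{"gma": "gma-1", "vcpus": 2, "memory_mb": 2048}' (illustrative schema).
    Returns a JSON acceptance message, or None if no offers were received.
    """
    offers = [json.loads(raw) for raw in offers_json]
    if not offers:
        return None
    # Assumed ordering: more vCPUs first, memory as the tie-breaker.
    best = max(offers, key=lambda o: (o["vcpus"], o["memory_mb"]))
    return json.dumps({"type": "accept", "gma": best["gma"]}, sort_keys=True)
```

A weighted score over both resources would be an equally plausible ranking; the lexicographic ordering is used here only because it is simple to reason about.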
Once the new execution node has been launched, it is then necessary to evaluate the proposed solution. The LMA evaluates the solution according to the degree of underused resources in the node that has just been instantiated. If a new resource distribution process must be conducted at the macro level, the SSA completes the evaluation performed by the LMA in order to penalize the solution. In both cases, the efficiency of the proposed solution is evaluated according to the amount of underused resources of the new instantiated node.
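The source does not give the formula used to score a solution, so the following Python sketch is only one possible reading of the description: efficiency is taken as the percentage of the allocated resources that the new node actually uses, and it is reduced by a penalty factor when a new distribution at the macro level was required. Both the formula and the default penalty value of 0.5 are assumptions.

```python
def efficiency(allocated: float, used: float,
               readaptation_needed: bool = False,
               penalty: float = 0.5) -> float:
    """Score a solution on a 0-100 scale (assumed formula).

    allocated: resources assigned to the new node by the solution.
    used: resources the node actually consumed during the review.
    readaptation_needed: True if the SSA had to trigger a new macro-level
        distribution, in which case the score is penalized.
    """
    if allocated <= 0:
        raise ValueError("allocated must be positive")
    # Clamp usage so that the score stays within [0, 100].
    used = min(max(used, 0.0), allocated)
    score = 100.0 * used / allocated
    return score * penalty if readaptation_needed else score
```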
Results in terms of QoS are represented in Figures 7 and 8, showing also the increase in quality once the adaptation is finished.
Subsequently, another set of experiments has been conducted for the distribution of infrastructure resources. In these experiments, we intended to perform numerous consecutive executions in the adaptation process, because a single execution of the proposed algorithm to perform the adaptation automatically could not satisfy the demand for the services. The result of the tests carried out from the point of view of the adaptation model was positive since it showed that it performed correctly, always within the limits of the case study and keeping the SLA level within the established limits.

It is not possible to make an empirical comparison of the proposed model with other existing approaches, as it is difficult to recreate the computing and/or simulation environments in which they were evaluated. However, it is possible to compare the proposed approach theoretically with other state-of-the-art works. Firstly, the proposed model follows a distributed approach to solving the problem, which clearly differentiates it from existing works [16,38]. This approach, which has been shown to be valid for the distribution of computational resources in this type of CC environment, has advantages with respect to availability, since there is no single component in charge of the distribution of resources; rather, the system (society) itself is reorganized as a whole through the individual adaptation of its components (agents).
In the state-of-the-art approaches, as detailed in particular by Goudarzi and Pedram [45], the execution of allocation algorithms is a complex task that requires a large amount of computational power and time. However, the proposed model simplifies the search for an adequate solution to the problem, as (i) the computational needs are distributed among different nodes; (ii) the value space to be considered is smaller since each node must only consider the data regarding its own resources and, furthermore, it is not necessary to have global knowledge of the platform; and, finally, (iii) each node can autonomously apply a partial solution to the problem, eliminating the coordination needs at the global level of the platform.
Finally, the clear advantage of the proposed model with respect to other works is its capacity for learning. In the macro-level infrastructure distribution model, an adaptation algorithm based on a case-based reasoning system has been used, integrated into a specialized agent in the +Cloud organization. Thanks to this approach, the system can learn from past experiences, memorizing both positive and negative ones, and achieve greater efficiency in its adaptations as it learns. Among the state-of-the-art approaches, there is no similar approach in which the distribution of computational resources is based on the results obtained in past adaptation processes.

Conclusions
This research is one of the first to propose the use of a MAS in the framework of control and surveillance systems in CC environments. The main result of this work is the proposal of a new architectural model based on a MAS with a clearly integrating purpose. To achieve this, a set of algorithms for the distribution of computing resources in CC environments has been developed, evaluated and assessed. The main innovation of the proposal lies in the dynamic capacity of the system to adapt autonomously to demand and to learn from previous experiences.
With the new proposal, it has been possible to demonstrate that a control and surveillance system in CC environments can be designed with a MAS as its basis and as the key element of the architecture. This is due to the distributed nature of MAS, which makes it possible to implement elastic algorithms to support the services that require distribution. Another key aspect is the distribution of responsibilities. Thanks to the use of this type of algorithm, it is possible not only to make decisions in the place where the problems actually arise, but also to distribute the calculation capacity needed to obtain an efficient solution among all the instances that comprise the CC environment. This type of solution contrasts with the way in which current centralized models solve the problem of elasticity. As this study has demonstrated, the use of a distributed approach is a viable option that should be considered when designing elastic algorithms.
This strategy ensures that each process is independent when making decisions in the software layers where many actions are running. It is undeniable that if the capabilities offered by the underlying technology change, it will also require that certain aspects be changed in the reasoning models used, just as would be done in any traditional strategy. However, in this case, the MAS architecture overcomes these challenges; thanks to its adaptive design, it is possible to modify the individual agents that carry out specific actions in the MAS. This approach ensures independence between the software layers in which decisions are made and those in which the decisions are executed. In a CC environment, such separation of responsibilities is particularly important because today's platforms are highly dependent on the technological environment (virtualization tools, load balancers, distributed file systems, etc.). This dependence constitutes a great limitation and hinders the platform's evolution, since a change in some of the hardware or software components also makes it necessary to alter the algorithms and techniques that make the system elastic. In the proposed MAS, implementation is made using communication ports according to the GORMAS methodology. Thanks to this methodology, dependence on the environment is limited to the port itself, i.e., interface for communication with the environment. There is no doubt that a change in the capabilities offered by the underlying technology would also make it necessary to modify the proposed reasoning models, as in a traditional design approach. However, organizational models also offer an adequate response to this difficulty.
In a system such as the one proposed, the agents that form part of the society are given one or more specific roles. Each role is an abstract definition of the objectives, responsibilities and privileges of the individual who assumes it. Following this high-level definition, if new technological capacities are introduced, the work of adaptation would consist in modifying the individual or individuals who carry out the concrete tasks and play the corresponding roles in the organization. Therefore, since it would be necessary to alter only the individual entities and not the society as a whole, the proposed architecture also has an advantage over other existing platforms on the market.
The proposed model has been designed to address the problem of excess energy consumption by proposing solutions that consider the degree of efficiency. As part of the problem, this study proposes a compaction model for VMs. This model makes it possible to define the problem of excess energy consumption within the framework of the CC platform. However, the learning capability of the proposed adaptation model is the key characteristic that can improve on the existing state-of-the-art approaches: while current models use mathematical algorithms and statistical systems, the proposed system develops specific solutions to a given problem on the basis of the degree of efficiency that the same or similar solutions have had in the past. Given that the system has learning capacities, its responses improve over time. Thus, this approach can continually increase the efficiency of the solutions it proposes. Moreover, the system's learning abilities are extremely important in an environment where there is some uncertainty, as is the case of CC. This is because when the context or environment of the CC platform changes, the adaptation model must likewise evolve and adapt the proposed solutions so that their efficiency is maximized.
In conclusion, the use of MAS enables us to continue researching techniques, tools and methodologies that will have intelligent characteristics, such as autonomy and pro-activity. In this regard, future lines of research will be proposed to continue extending the capacities of the proposed system, firstly, by incorporating the capacity provided by containers to improve the granularity in the distribution of resources and to achieve a much more precise resource allocation. Likewise, we also intend to extend the capacities of the MAS to distribute computing responsibilities beyond the cloud computing environment, within the framework of the edge computing paradigm, so that it can be integrated in an Internet of things environment and communication can be carried out at the edge.