1. Introduction
Cloud computing and its component technologies have received a great deal of attention, and a large body of research has been carried out since they were introduced. There are three main service models in cloud computing: Infrastructure-as-a-Service (IaaS), Platform-as-a-Service (PaaS), and Software-as-a-Service (SaaS). The essential characteristics of cloud computing, as defined by the National Institute of Standards and Technology (NIST), are on-demand self-service, broad network access, resource pooling, rapid elasticity, and measured service (Mell and Grance [1]). Virtualization software emulates hardware for each virtual machine, which can degrade performance relative to physical servers that are not virtualized. Nevertheless, server virtualization in the cloud environment has an advantage in terms of system management: several servers can be consolidated onto one physical machine, reducing the footprint, and virtual machines can be moved to other servers.
IaaS is a cloud service that provides a highly available infrastructure environment without the need to purchase computing resources or equipment. The hardware resources are virtualized and offered to customers. The user pays no hardware acquisition or maintenance costs, only an operational cost for the virtualized resources used, which are controlled by another party. The user can acquire virtualized resources on demand over the web through a service exposed at a particular endpoint, although that service is not necessarily centralized. Rapid elasticity, one of the main advantages of a cloud computing environment, lets the user allocate enough infrastructure resources to serve the usual incoming traffic and automatically allocate additional resources when traffic exceeds a specified threshold. Conversely, if traffic stays below a lower threshold for a certain period of time, unnecessary resources are released to keep infrastructure resources to a minimum.
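The threshold behavior described above can be sketched as a small decision function. This is a minimal illustration under stated assumptions, not the scaling logic of any particular cloud platform: the class `ScalingPolicy`, the function `next_instance_count`, and all threshold values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    """Hypothetical policy; every value here is an illustrative assumption."""
    upper_threshold: float   # load per instance above which one instance is added
    lower_threshold: float   # load per instance below which one may be removed
    cooldown_samples: int    # consecutive low samples required before removal (>= 1)
    min_instances: int = 1
    max_instances: int = 10


def next_instance_count(current: int, recent_loads: list[float],
                        policy: ScalingPolicy) -> int:
    """Return the instance count for the next interval.

    recent_loads holds load-per-instance samples, oldest first.
    """
    if not recent_loads:
        return current
    # Scale out as soon as the latest sample exceeds the upper threshold.
    if recent_loads[-1] > policy.upper_threshold:
        return min(current + 1, policy.max_instances)
    # Scale in only after the load has stayed below the lower threshold
    # for the whole cooldown window.
    window = recent_loads[-policy.cooldown_samples:]
    if (len(window) == policy.cooldown_samples
            and all(load < policy.lower_threshold for load in window)):
        return max(current - 1, policy.min_instances)
    return current


policy = ScalingPolicy(upper_threshold=0.8, lower_threshold=0.3, cooldown_samples=3)
print(next_instance_count(2, [0.5, 0.9], policy))       # heavy traffic: scale out
print(next_instance_count(3, [0.2, 0.1, 0.2], policy))  # sustained low traffic: scale in
```

Requiring several consecutive low samples before releasing an instance mirrors the "certain period of time" condition in the text and avoids removing resources on a momentary dip.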
Early on, cloud computing technologies focused on optimizing IaaS, for instance, using Amazon Web Services (AWS). PaaS is also a core service model for cloud computing; it provides services that automate the deployment and management of applications and makes multiple middleware services available. Because these middleware services are offered by the environment and prepared for the user, the user can focus on developing the core functionality of the application. The potential of PaaS cloud computing is therefore highly regarded. The SaaS model virtualizes everything up to the level of the actual functionality, whereas in PaaS the environment is virtualized but the user still needs to develop the core functionality. In fact, PaaS builds on IaaS, and SaaS may build on PaaS or IaaS. Software in the SaaS model is provided as a ready-to-use service, such as Adobe Creative Cloud and Microsoft Office 365. In particular, open source PaaS development and PaaS-based scientific applications form a nascent market [2,3]. In an analogy between the three cloud service models and transportation [4], IaaS is like leasing a car that clients can choose and drive wherever they like; PaaS is like taking a taxi, where clients do not drive themselves but need only tell the driver the destination; and SaaS is like going by bus, which follows set routes and is shared with other clients.
Cloud computing technologies are also being used for geo-spatial image services, for data distribution and processing in distributed environments containing data servers and processing programs. Yang et al. [5] analyzed cloud technologies currently available as commercial and open source services from the IaaS perspective and reviewed their practical application to obtaining geo-spatial information. Kang [6] evaluated the auto-scaling performance of geo-based image processing in an OpenStack environment to reveal advantages of the IaaS cloud, such as cost savings, scalability, flexibility, and high availability [7]. As noted by Costache et al. [8], recent service systems increasingly have to cope with larger and more diverse issues and workload volumes, while being required to operate at minimum cost and maintain the desired performance guarantees. Wu et al. [9] evaluated the performance of finite element analysis on several public clouds by solving a large-scale engineering problem, using performance metrics including elapsed time, speed-up, scalability, and stability. Duan [10] emphasized that the performance of cloud-based services has a significant impact on future information infrastructure. IaaS-based geo-spatial data processing services with actual image processing functions, such as change detection, segmentation, and the normalized difference vegetation index (NDVI), in an OpenStack environment were presented in Lee et al. [11]. According to the meta-analysis of Senyo et al. [12], the majority of cloud computing researchers used experiments or simulations as methods of inquiry rather than theoretical frameworks and models. Yue et al. [13] showed, through a comparative analysis of two cloud computing platforms, Microsoft Windows Azure and the Google App Engine, that geoprocessing cloud services can be elastic and cost-effective. Noor [14] suggested future directions for mobile cloud computing. If web services or software applications can be deployed or developed through PaaS services, system experts can deploy and develop such components easily. However, cloud application services using PaaS in geospatial data application fields are still at an early stage of development. There is a need to investigate the benefits of deploying or developing geospatial applications via PaaS, and to close gaps or issues related to these two tasks, either in general or with respect to the use of IaaS services. In this study, a PaaS-based service linked with OGC WPS 2.0 was built and tested against an IaaS-based service with the same function, on the basis of the performance results of an edge extraction algorithm in three cases of 300, 500, and 700 threads.
The remainder of this article is structured as follows. Section 2 briefly reviews the IaaS- and PaaS-based service models. Section 3 describes how the open-source geospatial image service has been realized via the PaaS-based technology. Section 4 presents the results of performance tests, in terms of response time and error rate, on both IaaS and PaaS cloud computing environments with OGC WPS 2.0. Section 5 discusses the results, and Section 6 concludes the paper.
2. IaaS- and PaaS-Based Application Services
To deliver services at any level or service model in the cloud, the first step for a client is to select the most suitable cloud computing solution, whether commercial or free and open source. In general, cloud services provided by commercial vendors may be more convenient than open source solutions because they make it easier to access, maintain, and implement a cloud-based service, as well as to perform security checks. Rodero-Merino et al. [15] surveyed the security of multitenant software platforms to examine PaaS security issues, such as the access control mechanism, blocking by synchronized static components, thread termination, and resource accounting at different layers. Kritikos et al. [16] studied a security-enhanced PaaS platform for multi-cloud applications and concluded that pragmatic results should be considered when selecting the technologies used to build and manage IaaS and PaaS systems for performance-demanding cases. Cloud-based services that are defined and deployed directly by the client and focus on specific areas have the benefit that specialized functions and features can be implemented in them. However, this requires a great deal of skill and experience, as such services are built directly from the hardware configuration up. Several open source solutions for building IaaS- and PaaS-based cloud computing service models are already available [17,18,19]. An application service is built from the services offered by each service model. Kim and Lee [20] studied factors, from hardware components to middleware, relevant to designing an application service for geo-based data processing. Kozhirbayev and Sinnott [21] compared the memory, network bandwidth, and storage overhead of container-based technologies on platforms such as native, Docker, and Flockport for the PaaS cloud. According to their experiments, input–output latency is exacerbated by these overheads, and the CPU cycles required for utilities can also cause performance degradation. Containers are one of the building blocks of PaaS; they allow clients to create applications on any virtual or physical infrastructure by combining all dependent elements, such as code, executables, and system libraries.
There are nine layers of technology for the IaaS and PaaS service models, showing the computing layers of hardware and software needed to implement IT services, as illustrated in Figure 1. Layers such as storage, server, networking, virtualization, runtime, middleware, application, and data are all necessary to build web services in a cloud environment, and there are many ways to implement them, depending on the purpose or user of the service. IaaS usually provides hardware, such as servers, networks, and virtualization. This allows developers to configure the physical portion of the hardware easily through virtual servers (VSs). Virtual servers logically divide one physical server and allocate resources, such as CPU and memory, and each virtual server can run a separate OS or application. Developers need to install the required runtime components, middleware software, databases, and web servers inside one or more of the provided instances and upload the application processor, which represents the processing function and application view of the user interfaces, as well as the data required for operation, into a PaaS-based service. Since software services differ between applications, functional analysis is necessary before providing any functionality in a cloud-based system. The stack of provided software components forms the runtime environment in which the developer's application runs, and the middleware software is required while the application works. In PaaS environments, these are converted to packs and services. This enables cloud clients or operators to operate the application view as soon as they upload the application processor and required data to PaaS.
A basic architecture of actual open source solutions for geo-spatial information services in IaaS and PaaS cloud computing environments, respectively, is represented in Figure 1. This can be extended to SaaS services, which provide geo-spatial image processing services directly on the Web, if additional features based on user requirements are added and datasets for the processing features are available. This schematic diagram depicts the main parts that developers need to consider when providing cloud-based geospatial services. As can be seen, PaaS contains a software stack, as a middleware layer, composed of open source solutions for geo-spatial data processing. This study also considered the interoperability standard for geo-spatial image processing services on the Web, the OGC Web Processing Service (WPS) [22]. The WPS standard defines how to request the execution of a process and how the process output should be handled. In particular, the Zoo-Project, a WPS implementation programmed in C, Python, and JavaScript, provides a framework for creating and chaining WPS-compliant web services, as well as a standard interface for OGC WPS 2.0, which consists of five operations: GetCapabilities, DescribeProcess, Execute, GetStatus, and GetResult (Zoo-Project Team [23]). Yoon et al. [24] conducted a performance test of satellite image processing with OGC WPS 2.0 running on Zoo-Project in the OpenStack IaaS cloud and reported results for the three requests other than GetStatus and GetResult, finding that the average response time increases rapidly as the number of users increases. Yoon and Lee [25] compared the performance of a non-WPS case, the synchronous process case of OGC WPS 1.0, and the asynchronous process case of OGC WPS 2.0 using two image processing algorithms, feature extraction and gradient magnitude computation, and reported that WPS 2.0 performed well in terms of response time.
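To make the WPS 2.0 operations listed above more concrete, the following sketch builds key-value-pair (KVP) GET URLs for the operations that are commonly requested this way. It is an illustration only: the endpoint `BASE`, the process identifier `EdgeExtraction`, the job identifier, and the helper `wps_get_url` are hypothetical, and the exact parameter spelling accepted by a given server should be checked against its documentation. Execute is left out because in WPS 2.0 it is normally sent as an XML document via HTTP POST.

```python
from urllib.parse import urlencode

WPS_VERSION = "2.0.0"


def wps_get_url(base_url: str, request: str, **params: str) -> str:
    """Build a KVP GET URL for a WPS operation (illustrative helper)."""
    query = {"service": "WPS", "request": request}
    # GetCapabilities negotiates the version, so it is sent without one here.
    if request != "GetCapabilities":
        query["version"] = WPS_VERSION
    query.update(params)
    return f"{base_url}?{urlencode(query)}"


# Hypothetical endpoint and identifiers, used only for illustration.
BASE = "http://localhost/cgi-bin/zoo_loader.cgi"

capabilities_url = wps_get_url(BASE, "GetCapabilities")
describe_url = wps_get_url(BASE, "DescribeProcess", identifier="EdgeExtraction")
status_url = wps_get_url(BASE, "GetStatus", jobId="example-job-id")
result_url = wps_get_url(BASE, "GetResult", jobId="example-job-id")

for url in (capabilities_url, describe_url, status_url, result_url):
    print(url)
```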
All middleware software or components, such as GeoServer, Zoo-Project, and the database management system, have to be run as instances directly on the IaaS service. PostgreSQL [26], the open source database management software, serves the client by retrieving the processing status. An image database was not built in this study. Actual image processing was performed on image files, while the metadata of the image data used in the experiment and the processing status information were stored in this database management system to communicate with the OGC WPS interfaces. GDAL [27] is open source software that helps to clip the processing zones and to convert the result into an image for the Web after it has been processed using the remote sensing algorithms in the Orfeo Toolbox [28]. Clipping is a function that extracts only the part of the image that falls within a certain range of the rectangular image data. GeoServer [29] was employed to manipulate the satellite image dataset and visualize the processing results. There are a number of things to consider when applying PaaS in order to realize an actual on-demand approach. An app, which is a small application developed by installing the web server and runtime environment, can be distributed using IaaS, PaaS, or both types of services together. Since PaaS offers a web server and runtime environment, developers use PaaS to build the right runtime environment and deploy the application in that environment from the Web. Virtual machines are created by emulating hardware on top of a hypervisor, which logically creates separate systems. The container arrangement, on the other hand, provides an independent application execution environment by loading the required kernels and libraries. The container approach has advantages in terms of resource efficiency, although containers do not use resources independently of one another, which makes it difficult to measure the amount of infrastructure resources consumed. Application services of IaaS are deployed on a per-instance basis, whereas those of PaaS are deployed on a per-container basis.
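The clipping operation mentioned earlier in this section can be illustrated on a small in-memory raster. This is a conceptual sketch only and does not use the GDAL API; the function `clip` and the pixel-window parameters are hypothetical names chosen for illustration.

```python
def clip(image, row_off, col_off, n_rows, n_cols):
    """Return the rectangular window of a 2D raster given as a list of rows."""
    if row_off < 0 or col_off < 0 or n_rows <= 0 or n_cols <= 0:
        raise ValueError("window needs non-negative offsets and a positive size")
    if row_off + n_rows > len(image) or col_off + n_cols > len(image[0]):
        raise ValueError("window exceeds the image bounds")
    # Keep only the rows and columns that fall inside the requested window.
    return [row[col_off:col_off + n_cols]
            for row in image[row_off:row_off + n_rows]]


# A 4 x 5 raster whose pixel value encodes its position (row * 10 + column).
raster = [[r * 10 + c for c in range(5)] for r in range(4)]
window = clip(raster, row_off=1, col_off=2, n_rows=2, n_cols=3)
print(window)  # [[12, 13, 14], [22, 23, 24]]
```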
4. Results of Performance Tests
Performance evaluations of the ‘DescribeProcess’ and ‘Execute’ requests, two of the five OGC WPS 2.0 requests (GetCapabilities, DescribeProcess, Execute, GetStatus, and GetResult), were carried out. The results of performance testing for the WPS execute operation in an IaaS environment without LBaaS and in a PaaS environment, with a varying number of threads, are shown in Figure 4, Figure 5 and Figure 6. The mean values of the results of the three experiments for the ‘DescribeProcess’ request are presented in Figure 7. In Figure 4, Figure 5 and Figure 6, the horizontal axis, representing the ramp-up period, shows the elapsed time in seconds. An experiment was performed, varying the ramp-up period by one hour. Since multiple experiments with different thread counts and variables could change the networking status and hardware conditions, the instances on IaaS and the apps on PaaS were reset to their initial state before each experiment so that the experiments were performed under conditions as similar as possible. The vertical axis shows the response latency, that is, the response time of each thread request in milliseconds. Parallel processing through HAProxy, which distributes the processing load across the instances, together with the log records in the Zoo-Project, confirmed that all three instances were being used for processing. For both IaaS and PaaS services, the response time increased over time, owing to a gradual increase in the load on the WPS servers, and the PaaS services were more responsive than those running on IaaS.
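The relation between the number of threads and the ramp-up period can be sketched as follows. In a JMeter thread group, the threads are started at equal intervals over the ramp-up period, so the load on the server grows gradually. The ramp-up value of 60 seconds and the function name are assumptions made for illustration; they are not the settings used in the experiments.

```python
def thread_start_offsets(num_threads: int, ramp_up_seconds: float) -> list[float]:
    """Start time of each thread, in seconds after the test begins.

    Each thread starts ramp_up_seconds / num_threads after the previous one.
    """
    if num_threads <= 0:
        raise ValueError("num_threads must be positive")
    interval = ramp_up_seconds / num_threads
    return [i * interval for i in range(num_threads)]


RAMP_UP_SECONDS = 60.0  # illustrative assumption, not the experimental setting

for threads in (300, 500, 700):
    offsets = thread_start_offsets(threads, RAMP_UP_SECONDS)
    print(f"{threads} threads: one new thread every {offsets[1]:.3f} s, "
          f"last thread starts at {offsets[-1]:.1f} s")
```

For a fixed ramp-up period, a larger thread count shortens the interval between thread starts, which is one reason the load rises faster in the 700-thread case.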
The asynchronous processing scheme supported by OGC WPS 2.0 was used, in which the server immediately sends an answer to the client when an execute request is made. Asynchronous calls prevent the WPS service from holding very long connections and prevent clients from being blocked while waiting for an answer. Generally, an asynchronous processing scheme is applicable to any condition in which it is undesirable to tie up local resources while a remote request is being processed. Server virtualization virtualizes the entire hardware environment, while containers virtualize the application execution environment. Each container can run applications without affecting other containers, and containers reduce performance degradation because of their low virtualization overhead. Since the Zoo-Project performs the actual processing, there is no actual burden on the geo-spatial image processing service. When execute processing is requested, a jobID issued in response to that request is sent to the service app and stored; the processing status is then checked periodically until the experiment is finished. As such, the more concurrent accesses there are, the longer the response time of an execute request becomes.
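The execute-then-poll pattern described above can be sketched with an in-memory stand-in for the server. This is not a WPS client and makes no network calls: the class `FakeWpsServer`, the function `run_async`, and the number of polls needed are hypothetical, and only the status names (Running, Succeeded) follow the WPS 2.0 vocabulary.

```python
import itertools
import time


class FakeWpsServer:
    """In-memory stand-in for an asynchronous WPS endpoint (illustration only)."""

    def __init__(self, polls_until_done: int = 3):
        self._polls_until_done = polls_until_done
        self._remaining = {}
        self._counter = itertools.count(1)

    def execute(self, process_id: str) -> str:
        # Answer immediately with a job identifier instead of the result.
        job_id = f"job-{next(self._counter)}"
        self._remaining[job_id] = self._polls_until_done
        return job_id

    def get_status(self, job_id: str) -> str:
        # Each status check simulates some progress on the job.
        if self._remaining[job_id] > 0:
            self._remaining[job_id] -= 1
            return "Running"
        return "Succeeded"

    def get_result(self, job_id: str) -> str:
        if self._remaining[job_id] > 0:
            raise RuntimeError("result requested before the job finished")
        return f"result-of-{job_id}"


def run_async(server, process_id, poll_interval=0.0, max_polls=100):
    """Submit a job, then poll its status until it has succeeded."""
    job_id = server.execute(process_id)          # returns at once with a jobID
    for _ in range(max_polls):
        if server.get_status(job_id) == "Succeeded":
            return server.get_result(job_id)
        time.sleep(poll_interval)                # wait before the next check
    raise TimeoutError(f"{job_id} did not finish within {max_polls} polls")


print(run_async(FakeWpsServer(), "EdgeExtraction"))  # result-of-job-1
```

Because the connection is released as soon as the jobID is returned, the client is free between status checks, which is the property the asynchronous scheme is chosen for.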
Response latencies for 300 threads, for IaaS without LBaaS and for PaaS, are shown in Figure 4. The result for 500 threads is shown in Figure 5: PaaS shows almost the same trend as in the case of 300 threads, but the response latencies for the IaaS service without LBaaS are higher than in the case of 300 threads. The result for 700 threads is shown in Figure 6. Here, the growth rate of the latencies of the IaaS service without LBaaS is much higher than that of the PaaS service. The response latencies of PaaS are relatively stable, although they appear to roughly double.
The average values of the five performance measurement experiments for each thread count, 300, 500, and 700 threads, are shown in Figure 7. This chart also shows the error rate of responses to the OGC WPS 2.0 ‘describe’ request. For the describe request, the process name was sent as a parameter, and the response contained the data path and the channel variable value for the edge detection algorithm applied in this experiment. These variables were then entered into the execute request to perform the actual data processing. Unlike the execute request, the OGC WPS 2.0 describe request is a single operation whose result disappears from memory once the request has been answered. The average time required for the describe operation is considerably shorter than that for the execute operation, as shown in Figure 7. In sum, the average execute response time for PaaS was significantly lower than that for IaaS, as shown in Figure 7. In contrast to the execute operation, the describe results show that the response of IaaS without LBaaS is slightly faster than that of PaaS. The reason is as follows: the app running on the IaaS cloud was configured as a single m1.medium instance with two vCPUs and 4 GB of RAM, whereas the apps on the PaaS cloud consisted of four containers, each with 1 GB of capacity, in an m1.xlarge instance with eight vCPUs and 16 GB of RAM. When a single request is handled by one instance in the IaaS environment and by one container in the PaaS environment, a ‘describe’ request can be answered more quickly by the higher-performance configuration of the IaaS instance.
The error rates, obtained from the summary report table of the JMeter tool, are shown in Table 4. Error rates, which represent failure rates, are expressed as the percentage of requests that received incomplete or unclear responses. Owing to the asynchronous processing functionality of WPS 2.0, a successful response to a request does not necessarily mean that the process has completed. When there is too much concurrent processing or the application is overloaded, the execute response may fail. For the PaaS service, the error rates of both the describe and execute operations were found to be less than 0.4% for every thread count. This corresponds to at most 28 failed requests out of 7,000 requests issued by 700 threads. The error rates of IaaS without LBaaS are higher than those of PaaS and are roughly proportional to the number of threads. One of the reasons is that the response time is longer and, thus, the load builds up over time. This shows that as the load rises with the number of threads, the error rate also rises, except in the one case of the PaaS execute operation with 500 threads. The maximum error rate rises to about 1.8% with 700 threads, which corresponds to 126 failures. In comparison, the error rates for IaaS are 4.7 times higher than those for PaaS.
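The relation between the percentages and the failure counts quoted above can be checked with a few lines of arithmetic. The sketch assumes a total of 7,000 requests for the 700-thread case, because that is the total at which 0.4% and 1.8% correspond to the quoted 28 and 126 failures; the total and the function names are assumptions for illustration and are not taken from Table 4.

```python
def error_rate_percent(failed: int, total: int) -> float:
    """Error rate as a percentage of all requests."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * failed / total


def failed_requests(rate_percent: float, total: int) -> int:
    """Number of failed requests implied by an error rate."""
    return round(total * rate_percent / 100.0)


TOTAL_REQUESTS = 7000  # assumption: total requests in the 700-thread case

paas_upper_bound = failed_requests(0.4, TOTAL_REQUESTS)
iaas_maximum = failed_requests(1.8, TOTAL_REQUESTS)
print(paas_upper_bound, iaas_maximum)          # 28 126
print(error_rate_percent(126, TOTAL_REQUESTS)) # 1.8
```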
5. Discussion
The use of open source cloud platforms for developing services that process geo-based images in a web environment is at an early stage. Therefore, experimental studies on the applicability and extensibility of cloud computing for geo-spatial processing are needed. This is the motivation for this research. Among many possible research topics, this study focuses on evaluating whether it is better to build a certain PaaS product over a PaaS or an IaaS service, using generic variables such as the number of threads, the ramp-up period, and the loop count, as well as the measurement of error rates regarding the completeness of requests in the cloud service.
The image processing function selected for the experiment is an edge extraction algorithm, a simple function that does not depend on other functions and requires minimal user input variables. It is one of the most frequently used satellite image processing functions. Since both IaaS and PaaS cloud services are basically web-based, the OGC WPS 2.0 standard was applied in the experiment for performance evaluation. Factors arising from differences between vendors and implementations that could affect the experimental process and results were excluded as far as technically possible. Comparing commercial systems with open source systems is outside the main scope of this study. This experiment did not apply the optional LBaaS load balancing. If LBaaS had been applied, it would have contributed greatly to improving the performance of the IaaS cloud. This study conducted experiments under limited conditions and with simple variables. Thus, further experiments are needed to measure performance under more practical conditions, using large amounts of data on distributed servers, to indicate which cloud environment is more effective. For instance, a study comparing the performance of the PaaS cloud with that of a service deployed on IaaS with LBaaS would also produce meaningful results, as in the case of applying open source tools, including Cloud Foundry and OpenStack. An experiment is also necessary to identify how the optional LBaaS service applied to IaaS can contribute to improving performance when it is linked with a PaaS load balancing scheme.
6. Concluding Remarks
Cloud computing can be utilized in a vast range of fields. SaaS services can be deployed and provisioned through the use of PaaS and IaaS services. To make widespread use of cloud computing technologies in scientific application services, including geo-spatial data processing, SaaS cloud services can be built on top of the many prior studies to produce valuable results. In particular, developers can expect to use cloud technologies for service development in application fields, as the PaaS cloud provides support for middleware solutions as well as other distributed computing resources. A great variety of applications can be developed and deployed through PaaS by exploiting middleware services and building a suitable and stable environment for executing these applications. Among the many issues related to open cloud computing applications, a performance evaluation was carried out to determine the need for PaaS in terms of geo-spatial image processing services.
As for the results, the execute request to OGC WPS 2.0 for an edge detection algorithm shows that the PaaS-based implementation, built on IaaS and PaaS clouds, performs better overall than IaaS without LBaaS. In contrast, IaaS is better than PaaS in terms of the delay time of the ‘describe’ operation. Under the same experimental conditions and variables for the performance tests, the error rates of most PaaS implementation cases were lower than those of IaaS without LBaaS. Practical studies are needed to apply and link this technology to geo-spatial data application fields. These experimental results can serve as a reference point for the use of this technology in geo-spatial image processing services with analysis functionality for large volumes of data.