Container-based clouds are used in a variety of service infrastructures because they are lighter and more portable than virtual machine (VM)-based infrastructures and can be deployed in both bare-metal and VM environments. Internet-of-Things (IoT) cloud-computing infrastructure is likewise evolving from VM-based to container-based. In IoT clouds, service availability matters more for mission-critical IoT services, such as real-time health monitoring, vehicle-to-vehicle (V2V) communication, and industrial IoT, than it does for general computing services. However, when containers run on VMs, the current fault detection method considers only the container infrastructure, which limits the level of availability achievable for mission-critical IoT cloud services. Fault detection and recovery methods that consider both the VM and container levels are therefore necessary in this environment. In this study, we analyze the fault-detection architecture in a container environment, and we design and implement a Fast Fault Detection Manager (FFDM) architecture using OpenStack and Kubernetes to realize fast fault detection. Performance measurements show that the FFDM detects faults more than three times faster than the existing method.
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.