Development of a Design Methodology for Cloud Distributed Control Systems of Mobile Robots

This article addresses the development of cloud distributed control systems for mobile robots. The authors emphasize the lack of a design methodology to guide the development process in accordance with specific technical and economic requirements for the robot. Based on an analysis of various robot architectures, the nine most significant parameters are identified to direct the development stage by stage. Based on these parameters, a design methodology is proposed for building a scalable three-level cloud distributed control system for a robot. The application of the methodology is demonstrated on the example of the AnyWalker open-source robotics platform. The developed methodology is also applied to two other walking robots illustrated in the article.


Introduction
The control infrastructure of a modern mobile robot is a distributed multiprocessor computing complex that includes computing modules on individual devices, external computing resources, and means of exchanging data and commands over wireless networks [1]. For mass-produced mobile robots, both industrial and those used in education and experimental laboratories, a cloud computing board is often created [2], including software and hardware for controlling mechanisms and for receiving and processing sensor data; the solution of operational tasks is left to local nodes [3]. For robots of the same type, the use of a single cloud computing infrastructure improves the technical and economic characteristics, since cloud technologies reduce the requirements for the expensive computing resources of each individual mobile robot [4]. For robotic systems that process large amounts of data and synthesize control actions over many degrees of freedom, the task of forming an external computing infrastructure is one of the most important. Distributed control architectures are relevant when developing information and control systems for mobile robotic platforms that move in a heterogeneous environment and need to recognize dynamic obstacles and interact with various objects or people. The implementation of these functions requires significant computational resources, which are not available on board due to restrictions on the power consumption and dimensions of the mobile robot. The use of cloud services over wireless computer networks makes it possible to solve these tasks, provided that the limit on the maximum feedback delay in the robot control loop is met [5,6].
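The feedback-delay constraint above can be made operational with a simple check of measured round-trip delays against a budget. The sketch below is our own illustration, not code from the paper; the 200 ms default budget anticipates the figure derived later for AnyWalker, and the sample delays are invented.

```python
# Sketch: checking measured control-loop delays against a feedback-delay
# budget before offloading computation to the cloud. The 200 ms budget and
# the sample delays are illustrative assumptions, not values from the paper.

def within_delay_budget(round_trip_delays_ms, budget_ms=200.0, quantile=0.95):
    """Return True if the chosen quantile of measured round-trip delays
    fits within the allowed feedback-delay budget."""
    ordered = sorted(round_trip_delays_ms)
    # index of the quantile sample (simple nearest-rank estimate)
    k = min(len(ordered) - 1, int(quantile * len(ordered)))
    return ordered[k] <= budget_ms

# Example: delays measured between the robot and a cloud service (ms)
samples = [35.0, 40.0, 52.0, 48.0, 300.0, 41.0, 38.0, 45.0, 50.0, 44.0]
print(within_delay_budget(samples))  # → False (the 300 ms outlier lands at the 95th percentile)
```

Using a high quantile rather than the mean reflects the fact that occasional long delays, not the average, break a control loop.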
The development of the element base and of architectures of distributed computing complexes, together with the spread and improving quality of wireless data transmission networks, determines the directions for improving the methods of theoretical analysis and experimental research of computing complexes so as to improve the technical, economic, and operational characteristics of mobile robots.
However, the peculiarities of the structures, the tasks to be solved [37] and the criteria for the quality of functioning require the development of specialized methods aimed at improving the technical and economic characteristics of computing complexes.
Thus, the aim of this research is to develop a design methodology for constructing a three-level architecture of a walking robot computing complex with reactive, executive, and application levels. The reactive level is implemented on a microcontroller that controls the actuators, processes sensor data, and monitors energy consumption. The executive level is implemented on a microprocessor with a full-fledged operating system; it provides the basic functionality of the system (orientation in space, video processing, state automaton) taking into account the limitations on computing power, and provides access to the API. The application level is distributed cloud software that solves computationally expensive tasks: physical modeling of motion, elements of artificial intelligence, and collective behavior.
The difference between the architecture and its analogues consists in taking into account the characteristics of computing resources and network infrastructure based on the specified technical and economic requirements and operating conditions.
The methodology is built upon a set of parameters derived from the analysis of technical and economic requirements and operating conditions, which determine the characteristics of the information and network infrastructure of a cloud computing complex. Nine major parameters are formulated that have a significant impact on the distribution of resources between cloud services and local robot resources, as well as on the characteristics of data transmission networks and the acceptable delays in data exchange.

Background
The design of the on-board computer network includes the choice of the data exchange interface and the network topology. The technical characteristics of the data exchange interfaces most commonly used between the various parts and electronic units of control computing complexes are summarized in Table 1. Among laboratory stands, we should mention Cubli [43], a one-dimensional prototype of a cube that can balance on an edge. The control system of the cube includes two controller units interlinked via the CAN interface. The first unit acts as the main controller and the second one controls the motor. The main unit is linked with the IMU sensors, the brake servomotor, and the encoder; the SPI interface is used to connect the main unit with the IMU sensors. The STM3210E debug board for the Cortex-M3-core STM32F103E microcontroller is used to implement the main controller. The reasons for choosing that board include rapid prototyping capability and broad community support. The operating system is FreeRTOS, a real-time operating system that combines prioritized multitasking with a relatively small kernel size of only 4 KB.
We should also mention the triple inverted pendulum on a cart described in [44]. The main controller is built upon the dSPACE DS1103 [45] module for measurements and control. All the initial research and experiments were conducted in MATLAB/Simulink. Some of the values were pre-calculated, and lookup tables were built by interpolating them. With the dSPACE DS1103, all the needed I/O configuration could be done in the MATLAB/Simulink environment and then run automatically in compiled form on the device. Thus, the speed of prototyping increased dramatically while the number of experiments was reduced.
Next, we should look at the electronic control units (ECUs) of electric vehicles. ECUs are used to group the electronic systems of the vehicle into a unified hardware and software architecture. Thus, an ECU can be responsible for controlling more than one electronic subsystem of the vehicle; nevertheless, multiple ECUs can be applied if appropriate. The latter leads to the problem of reducing the number of wires with which the ECUs are connected. The CAN network is the most frequent solution to that problem [46][47][48][49]. CAN is able to transfer data efficiently even in the presence of electromagnetic interference. The transfer rate can be close to 1 Mbit/s, which is quite enough for a variety of tasks. In [46], an example of a CAN network providing communication with all the devices except for GPS is described. To meet the requirements of specific devices, such as special network settings, filtering, or isolation, multiple subnets were organized, with network gateways (ADFWeb or dSPACE MicroAutoBox) linking them all. For the implementation of the main controller over all ECUs in the vehicle, the dSPACE MicroAutoBox II was chosen. The PC of the car, built upon an Intel P8700 with 1 GB of RAM and 500 GB of hard disk space, is Ethernet-connected with the dSPACE MicroAutoBox II. dSPACE ControlDesk is used to record logs and perform experiments; to provide fast prototyping, ControlDesk is bound to Simulink.
The CAN bus is also very common in bipedal walking robots, for example, in HRP3 [50], iCub [51], and DRC-HUBO+ [52]. EtherCAT [53,54] is also quite frequent; it solves a key problem of the CAN bus, namely, the impossibility of high-frequency (1 kHz) access to low-level controllers [54]. As noted in [55], PCI, as well as PCI-104 and PCIe, are frequently used to link the network cards, motherboards, and analog-to-digital converters. To interlink the motors and encoders, CAN or EtherCAT are used, whose strength is reliability. The limited bandwidth of the CAN bus should be noted (1 Mbit/s), as well as the inability to connect many devices; to avoid this, several buses are frequently used in humanoid robots [50,51]. SERCOS III, which follows principles similar to EtherCAT and offers similar performance, can be found in some robots [55]. Consider the Walk-Man software and hardware architecture [53]. To control the movements of the robot, a COM Express computing module based on a quad-core i7 processor was used. The EtherCAT Master Device Manager is launched on it; it organizes the operation of the EtherCAT slaves in terms of synchronization and the real-time receiving and sending of data about their relative position. YARP [56] is applied as the remote-access middleware for the robot motors, as it combines high speed with low latency. In the initial stage of development, there was direct communication between the YarpServer and the ROS Core. However, such a choice led to frequent connection losses, from which the interconnected YARP/ROS servers might not recover. To avoid that, the control PC and the robot were separated, each running its own RosCore and YarpServer.
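The 1 Mbit/s ceiling of classical CAN mentioned above can be checked against a planned node set with a quick bus-load estimate. The sketch below is our own back-of-the-envelope calculation, not from the paper; the message counts and rates are hypothetical.

```python
# Sketch: estimating CAN bus utilization to see whether the ~1 Mbit/s ceiling
# becomes a bottleneck as devices are added. The frame overhead figure is for
# classical CAN with 11-bit identifiers; the message set is hypothetical.

CAN_BITRATE = 1_000_000  # bit/s, the practical ceiling noted for CAN

def can_utilization(messages):
    """messages: list of (frequency_hz, payload_bytes) tuples.
    A classical CAN data frame carries roughly 47 bits of overhead plus
    8 bits per payload byte (ignoring bit stuffing in this rough estimate)."""
    bits_per_second = 0
    for freq_hz, payload in messages:
        frame_bits = 47 + 8 * payload
        bits_per_second += freq_hz * frame_bits
    return bits_per_second / CAN_BITRATE

# Hypothetical node set: 27 drives at 100 Hz with 8-byte frames,
# plus 10 sensors at 50 Hz with 4-byte frames.
load = can_utilization([(100, 8)] * 27 + [(50, 4)] * 10)
print(f"bus load: {load:.1%}")  # → bus load: 33.9%
```

Even this modest configuration consumes a third of the bus, which motivates the split into several buses (or a switch to EtherCAT) described for the humanoid robots above.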
EtherCAT is a fairly popular solution, especially for complex systems with a large number of sensors; for example, there are projects for robotic skin [57], four-legged [58,59] and six-legged [60][61][62] robots, and medical rehabilitation robots and exoskeletons [63][64][65]. Here is a brief description of the architecture of another human-like robot based on EtherCAT and ROS, TALOS (Pyrène), developed by PAL Robotics [54]. The robot is equipped with two computers, each with a dual i7 CPU at 2.8 GHz, with eight cores available due to multi-threading. However, because the RT-PREEMPT real-time operating system is used, only four cores are available on the control computer; eight cores are available on the computer responsible for vision and high-level computing. All PAL Robotics robots, including TALOS, are deeply integrated with the Robot Operating System (ROS); the operating system is Ubuntu 14.04 LTS. Ros_control is actively used, which allows one to quickly move between simulation and tests on a real robot. The authors note, however, that although ros_control is easier to use than OpenRTM, it currently yields to OpenRTM in layout [66].
Aircraft flight controller architectures often include Vicon's motion capture system; the frequency of the system and the number of sensors used may vary. For example, in [67], a Vicon system with a frequency of 200 Hz was selected. Vicon connects via Ethernet to the ground station on which ROS is installed. The data is then sent over a UDP bridge using the MAVLink (Micro Air Vehicle Link) communication protocol to the Pixhawk PX4 autopilot, from which commands are sent to the motor drivers via I2C. In addition, there are variants that use the ROS-MATLAB Bridge [68].
To control KUKA industrial manipulators, the KUKA KR C4 is used. The control is organized via the Cabinet Control Unit (CCU), which interfaces with all the components as the main board. The KUKA Control PC (KPC) provides the user interface. Motor control is provided via the KUKA Servo Pack (KSP). The robot power system has its own controller, the KUKA Power Pack (KPP). The collection of motor position and temperature data is performed with the Resolver Digital Converter (RDC). The safe operation of the robot is ensured by the Safety Interface Board (SIB). Moreover, a board for Ethernet, a Dual NIC, and the SmartPAD, as well as the Controller System Panel (CSP), are connected to the CCU; the SmartPAD serves as the operator panel. Five interfaces are used to connect the above-listed items to the CCU. The connection between the controller and industrial networks such as DeviceNet and PROFIBUS can be made via the KEB interface; the developers also claim support of INTERBUS, EtherCAT, Ethernet/IP, VARAN, and PROFINET with the KEB interface. It is interesting to note that Universal Robots e-Series collaborative robots have a control box that supports only three industrial network standards: PROFINET, Ethernet/IP, and Modbus TCP. Besides, all these standards provide only soft real time, with nondeterminism and no guarantees on the transmission delay [69].

Development of a Cloud Control Architecture and Derivation of the Methodology
The scheme of the problem of cloud computing architecture development is depicted in Figure 1. On the left, there is a set of users controlling the robots; in the center, there is a cloud distributed control platform organized (in the common case) as a set of web servers distributed across the world, balancing the traffic load and providing simultaneous access for users to the robotic control web services. The back end of the control platform also performs heavy computations for the robots, such as assessing the robot's environment with computer vision techniques and automatic feedback-based correction of the robot's parameters. On the right is a set of robots performing operations under the control of the users from the left side; the communication between the user and the robot is carried out through the cloud distributed control platform.
We demonstrate the process of cloud computing architecture development on the example of the AnyWalker walking mobile robot (Figure 2). AnyWalker is a non-anthropomorphic walking robot with a system for compensating external impacts using motor-wheels that can stabilize the robotic system in three dimensions [20].
For the robot under consideration, taking into account the application of Robot Operating System (an open platform with plenty of sensor interfaces), a functional diagram of the distribution of tasks and data exchange channels in the cloud architecture of the control computing unit is depicted in Figure 3.
Analysis of the requirements for the choice of software, hardware, and network solutions for building a cloud platform, together with theoretical and experimental studies on a number of implementations of walking robots, made it possible to form a three-level architecture of a cloud-based distributed multiprocessor control computing complex (Figure 4):

1. The reactive level, based on a microcontroller, provides control of actuators, data processing from sensors, and control of energy consumption.
2. The executive level, implemented on a microprocessor with a full-fledged operating system, implements the basic functionality of the system (orientation in space, video processing, state automaton) taking into account the limitations on computing power, and provides access to the API.
3. The application level represents distributed cloud application software that solves computationally expensive tasks: physical modeling of motion, elements of artificial intelligence, and collective behavior.

Based on the analysis of the conducted studies on the choice of the characteristics of the computing complex, network protocols, and the construction of the information infrastructure, a set of parameters was formed that determines the characteristics of the information and network infrastructure of the cloud computing complex of walking robots based on the given technical and economic requirements and operating conditions. The nine most significant parameters were derived:

1. General technical requirements for the implementation: weight and dimensions, satisfaction of the requirements of industry standards for integration into existing processes, and the application of specialized software packages (ROS, MATLAB, etc.).
2. Assessment of the number of connected sensors: the number of sensors is estimated in order to evaluate the load on the network. This assessment determines the choice of network interfaces and standards (EtherCAT, SERCOS III, etc.).
3. Specification of the number of drives and non-motorized degrees of freedom: the number of drives determines the workload of the nodes of the computing complex, i.e., the workload of servers and communication channels. For a significant number of drives, it is necessary to divide the computing complex into modules and/or use parallel channels for separate groups of motors.
4. Assessment of the need for rapid prototyping: when rapid prototyping is required, it is necessary to use MATLAB/Simulink and systems that support them, such as the dSPACE DS1104, MicroAutoBox, or other external computers for development and debugging.
5. Assessment of the criticality of fault tolerance requirements: if one of the main requirements for the architecture is reliability (for example, the braking system of a car or the autopilot system of a copter), one should choose, for example, the CAN network, which has proven itself as a network with high fault tolerance.
6. Determination of the computing power needed for the operation of the system: if large computing power is required that for one reason or another cannot be placed in the device being developed, it can be transferred to an external computing module. Calculations can also be divided logically between different nodes of the system.
7. Evaluation of the criticality of the noise immunity of the device: the popular solution when noise immunity is critical is CAN. However, industrial network standards (EtherCAT, SERCOS III, and others) also have good noise immunity.
8. Evaluation of the limitation on the distance between interacting modules: for example, CAN, with all its advantages in reliability, loses transmission speed significantly at distances over 30 m.
9. Assessment of real-time requirements: depending on the system, a network that guarantees hard real time may be required. An estimate of the permissible delay in the robot control loop can then be carried out.
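The nine parameters above can be captured as a simple checklist structure for use at the start of a design. The sketch below is our own shorthand, not notation from the paper; the field names are illustrative, and the sensor count in the example is a hypothetical placeholder.

```python
from dataclasses import dataclass

@dataclass
class DesignParameters:
    """The nine design parameters of the methodology, as a checklist.
    Field names are illustrative shorthand, not terms from a standard."""
    general_requirements: str        # 1: weight, dimensions, standards, software packages
    sensor_count: int                # 2: estimated number of connected sensors
    drive_count: int                 # 3: drives and non-motorized degrees of freedom
    rapid_prototyping: bool          # 4: is MATLAB/Simulink-style prototyping needed?
    fault_tolerance_critical: bool   # 5: e.g., braking or autopilot systems
    external_compute_needed: bool    # 6: offload heavy computation to the cloud?
    noise_immunity_critical: bool    # 7: strong electromagnetic interference expected?
    max_module_distance_m: float     # 8: distance between interacting modules
    max_loop_delay_ms: float         # 9: permissible feedback-loop delay

# AnyWalker values as reported later in the article (27 motorized DoF,
# 200 ms delay budget); the sensor count here is an assumed figure.
anywalker = DesignParameters(
    general_requirements="ROS, IoT interoperability",
    sensor_count=30, drive_count=27, rapid_prototyping=True,
    fault_tolerance_critical=True, external_compute_needed=True,
    noise_immunity_critical=True, max_module_distance_m=1.0,
    max_loop_delay_ms=200.0,
)
print(anywalker.drive_count)  # → 27
```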
Based on the identified set of parameters and the constructed architecture, a methodology for constructing a three-level architecture of a computational control complex of walking robots is formed:

1. Evaluation of the parameters based on the analysis of technical and economic requirements and operating conditions that determine the characteristics of the information and network infrastructure of the general computing complex.
2. Distribution of functional tasks across each of the three levels of the computing complex: the reactive, executive, and application levels.
3. The choice of data exchange technologies and microprocessors, based on the obtained estimates of the parameters and the tasks to be solved.
4. Formation of the information and network infrastructure of the complex.
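Step 3 of the methodology can be sketched as a decision helper that maps parameter estimates to a fieldbus choice. The thresholds below are our own illustrative reading of the guidelines in the parameter list (CAN for fault tolerance and noise immunity at short range; EtherCAT for many sensors or ~1 kHz hard real-time access), not rules stated by the authors.

```python
# Sketch of Step 3: choosing a fieldbus from the parameter estimates.
# The thresholds are illustrative assumptions, not values from the paper.

def choose_bus(sensor_count, noise_critical, distance_m, hard_realtime_khz):
    if hard_realtime_khz >= 1.0 or sensor_count > 40:
        # CAN cannot offer high-frequency (1 kHz) access to low-level
        # controllers and struggles with many devices on one bus
        return "EtherCAT"
    if noise_critical and distance_m <= 30.0:
        # CAN loses transmission speed significantly beyond ~30 m
        return "CAN"
    return "Ethernet"

print(choose_bus(sensor_count=30, noise_critical=True,
                 distance_m=1.0, hard_realtime_khz=0.2))  # → CAN
```

A real design would weigh all nine parameters jointly; the point of the sketch is only that the parameter estimates from Step 1 feed mechanically into the technology choices of Step 3.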

Implementation
In accordance with the developed methodology, a control computing complex was developed for AnyWalker (Figure 5). According to the methodology (Step 1), the evaluation of parameters based on the analysis of technical and economic requirements and operating conditions was carried out. The first parameter, general technical requirements, for AnyWalker assumed a multipurpose application in heterogeneous operating conditions and interaction with IoT devices. It was necessary to provide the possibility of connecting a large number of sensors from different manufacturers. Therefore, the choice fell on ROS, a platform whose open repository contains interfaces for the overwhelming majority of sensor manufacturers.
The second parameter, assessment of the number of connected sensors, led to the choice of a high-bandwidth onboard computer network based on UART, as the number of connected sensors would be large.
The third parameter, determination of the number of drives and non-motorized degrees of freedom, amounted to 27 motorized DoF, so it was necessary to provide synchronous operation of different drives with low latency. A search was made among variants of the distribution of the architecture of the computing complex.
The first evaluated version of the controller was the STM32F4DISCOVERY Discovery kit based on the STM32F407VG microcontroller. The flash memory of the microcontroller contains the program code for interacting with EPOS2 and GL-SVG-02/2, as well as the code of the linear quadratic controller. This option is the simplest and most compact of all evaluated, as the delays are minimal due to the simplicity of the system. However, when scaling the system, it will be necessary to control three flywheels at once, connect a large number of sensors, and increase the amount of program code for the functioning of the remaining systems of the robotic walking platform. In this case, a better choice would be to connect all devices to a controller running the Robot Operating System (ROS). Therefore, the next stage of research was the serial connection of two software controllers physically located on different platforms. A Raspberry Pi 3 minicomputer with the Ubuntu 16.04 operating system and the ROS Kinetic framework installed on it was connected to the STM USART interface. An ASUS GL553V laptop with an Intel Core i7-7700HQ 2.80 GHz processor and 16 GB of RAM, running the Windows 10 Home operating system and the MATLAB 2017b environment, was connected to the Raspberry Pi 3 via the UTP LAN interface.
The third controller option is the STM32 NUCLEO-144 board instead of the STM32F4DISCOVERY of the second variant. The reason for this choice is the presence of an RJ45 Ethernet connector on the board, which allows one to increase the speed of the connection between the STM and the Raspberry Pi 3 from 4 to 100 Mbit/s. In this version, the STM, the Raspberry Pi 3, and the laptop are connected to a switch.
The third variant of the architecture implementation was chosen, since this option allowed for the transmission of the significant parameters of the state of the robotic system at a frequency of more than 100 Hz and a control cycle frequency of the stabilization algorithm of more than 200 Hz.
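The link-speed comparison behind this choice can be sketched as a rough feasibility check. The 4 Mbit/s UART and 100 Mbit/s Ethernet figures come from the text above; the 500-byte state packet and the 20% overhead reserve are our own assumptions, and a real load would also include sensor and video traffic sharing the link.

```python
# Sketch: upper bound on the state-update rate a link can carry.
# Link speeds are from the text; packet size and overhead are assumptions.

def max_update_rate_hz(link_bps, packet_bytes, protocol_overhead=0.2):
    """Upper bound on packet rate, reserving a fraction for framing/overhead."""
    usable_bps = link_bps * (1.0 - protocol_overhead)
    return usable_bps / (packet_bytes * 8)

STATE_PACKET = 500  # bytes, hypothetical full state of a 27-DoF robot
print(max_update_rate_hz(4_000_000, STATE_PACKET))    # UART: 800.0 Hz
print(max_update_rate_hz(100_000_000, STATE_PACKET))  # Ethernet: 20000.0 Hz
```

Both links clear the 100 Hz state-transmission requirement in isolation; the headroom of the 100 Mbit/s Ethernet link is what matters once the state stream shares the channel with other traffic.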
The evaluation of the fourth parameter, assessment of the need for rapid prototyping, led to the conclusion that it is extremely important to be able to test various configurations during the prototyping process, which was made possible by the ROS-MATLAB linkage and MATLAB code generation.
For the fifth parameter, assessment of the criticality of fault tolerance requirements, since reliability is a key requirement for the hardware and software architecture of the walking platform, a CAN Network with high fault tolerance was used to organize the on-board network.
For the sixth parameter, determination of the computing power needed for the operation of the system, large computing power was required for robot stabilization and for processing the data of the computer vision sensors; therefore, an external MATLAB Speedgoat computer was included in the architecture.
For the seventh parameter, evaluation of the criticality of the noise immunity of the device, it was suggested that AnyWalker could operate in severe operating conditions with electromagnetic fields of high intensity, so the CAN interface was chosen to provide the noise immunity of the device.
For the eighth parameter, evaluation of the limitation on the distance between interacting modules, due to the compactness of the placement of onboard computing modules of AnyWalker, CAN was used without a significant loss of data transfer speed on board the robot.
For the ninth parameter, assessment of real-time requirements, the necessity of estimating the permissible delay in the feedback loop was determined. After the analysis, it was concluded that those delays should not exceed 200 ms.
At Step 2 of the methodology, the distribution of functional tasks was carried out for each of the three levels:

1. The reactive level is implemented on the STM32F407 microcontroller. The MPU6050-based accelerometers/gyroscopes are polled via the I2C bus, and the data is filtered using the Madgwick sensor fusion algorithm. The movement of the flywheels and the calculation of their speed are carried out using Maxon EPOS2 controllers with Maxon EC motors. Dynamixel MX-106T actuators for the robot legs are controlled via RS-485, using a MAX485-based converter.

2. The executive level is based on the Raspberry Pi controller with the Robot Operating System installed and is connected to the reactive level via the RS-485 bus. This level implements simple autonomous behavior and a state machine, and provides security and emergency shutdown. Data transfer to the cloud is carried out using Bluetooth, WiFi, or Ethernet, as necessary.

3. The application level comprises software installed on a personal computer or smartphone, fully or partially located in the cloud. The application level provides a high-level user interface, supports the APIs of cloud voice recognition services, collects sensor data and control commands for machine learning purposes, and connects several AnyWalker robots to provide a pattern of collective behavior. It is possible to use algorithmic control support as an information service that allows third-party developers to use the API to solve application problems. An application for the Android platform has also been developed to send motion commands to AnyWalker and display the result.
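The executive level's state machine with emergency shutdown can be sketched as follows. The state and event names are our own illustrations, not the robot's actual firmware API; the sketch only shows the pattern of ordinary transitions plus a shutdown event reachable from any state.

```python
# Sketch: a minimal state machine for the executive level, covering simple
# autonomous behavior and the emergency shutdown mentioned above.
# States and events are hypothetical, for illustration only.

class ExecutiveStateMachine:
    TRANSITIONS = {
        ("idle", "start"): "walking",
        ("walking", "obstacle"): "stabilizing",
        ("stabilizing", "recovered"): "walking",
        ("walking", "stop"): "idle",
    }

    def __init__(self):
        self.state = "idle"

    def handle(self, event):
        # Emergency shutdown is reachable from any state, as a safety guarantee
        if event == "emergency":
            self.state = "shutdown"
        else:
            # unknown events leave the state unchanged
            self.state = self.TRANSITIONS.get((self.state, event), self.state)
        return self.state

sm = ExecutiveStateMachine()
print(sm.handle("start"))      # → walking
print(sm.handle("obstacle"))   # → stabilizing
print(sm.handle("emergency"))  # → shutdown
```

Keeping the transition table explicit makes the safety property easy to audit: every path to "shutdown" is a single unconditional branch, independent of the table.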
At Steps 3-4 of the methodology, the choice of data exchange technologies and microprocessors, based on the obtained estimates of the parameters, was carried out (Figure 6).
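As an illustration of the sensor fusion performed at the reactive level, the sketch below uses a one-axis complementary filter as a simplified stand-in for the Madgwick algorithm mentioned above. Real Madgwick fusion operates on quaternions over all three axes; this version only shows the underlying idea of blending gyroscope integration with an accelerometer-derived tilt estimate.

```python
# Sketch: a 1-axis complementary filter, a simplified stand-in for the
# Madgwick sensor-fusion algorithm used at the reactive level.

import math

def complementary_filter(angle_deg, gyro_dps, accel_x, accel_z, dt, alpha=0.98):
    """Blend the integrated gyro rate with the accelerometer tilt estimate:
    the gyro is trusted at short timescales, the accelerometer long-term."""
    gyro_angle = angle_deg + gyro_dps * dt
    accel_angle = math.degrees(math.atan2(accel_x, accel_z))
    return alpha * gyro_angle + (1.0 - alpha) * accel_angle

# Stationary, level sensor: the estimate stays at zero
angle = 0.0
for _ in range(100):
    angle = complementary_filter(angle, gyro_dps=0.0, accel_x=0.0,
                                 accel_z=9.81, dt=0.01)
print(round(angle, 3))  # → 0.0
```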

Subsequently, the application development and cloud services configuration were carried out [70][71][72], and the architecture was finally implemented.
The core components in the scheme are the STM32F407 micro-controller for the periphery and the Raspberry Pi 3 microcomputer serving as a top-level controller with interfaces for the inter-operation of control software and a vast majority of clients with peripheral devices.
The controller of the periphery interacts via the SPI protocol with a 9-axis inertial navigation system consisting of gyroscopes, accelerometers, and magnetometers on an MPU9250 chip. The equipment installed in the robot is controlled via the EPOS2 motor drivers with the CAN communication protocol. The legs of the robot are driven by Dynamixel MX-106 servomotors, connected via the RS-485 protocol; six servomotors connected in series are used per foot. Both buses are connected to two independent UART interfaces of the microcontroller.
The UART interface is used for the interaction of the upper-level controller and the controller of the periphery. The peripheral control codes are sent over this interface, and the data from the INS sensors, the flywheel rotation speed, and the temperature and servo load values are requested over UART as well. The STM32F407 controller is also connected via USB for debugging and downloading updated software using an ST-Link v2 programmer.
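The request/response exchange over UART described above can be sketched as a simple framing scheme. The frame layout (start byte, command, length, payload, checksum) and the command code are hypothetical conventions for illustration, not the actual AnyWalker protocol.

```python
# Sketch: framed request/response exchange between the upper-level controller
# and the peripheral controller over UART. The frame layout is hypothetical.

START = 0xAA

def encode_frame(command, payload=b""):
    body = bytes([command, len(payload)]) + payload
    checksum = sum(body) & 0xFF  # simple additive checksum
    return bytes([START]) + body + bytes([checksum])

def decode_frame(frame):
    if frame[0] != START:
        raise ValueError("bad start byte")
    command, length = frame[1], frame[2]
    payload = frame[3:3 + length]
    if (sum(frame[1:3 + length]) & 0xFF) != frame[3 + length]:
        raise ValueError("checksum mismatch")
    return command, payload

# Request the flywheel rotation speed (command 0x10, hypothetical)
frame = encode_frame(0x10)
print(decode_frame(frame))  # → (16, b'')
```

The checksum lets the peripheral controller drop corrupted requests instead of acting on them, which matters on a shared serial link carrying both commands and telemetry.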
The scheme also provides for clients implementing various logic elements. The high-level controller is connected to the clients via an Ethernet channel or a wireless WiFi network.
In accordance with the chosen architecture, the main mathematical calculations are performed in a cloud environment, and high-speed data exchange with the robot board is provided via a broadband communication channel. At the same time, the computers on board the robot control the servos and provide primary filtering of the data coming from the sensors. For example, a separate ARM computer is installed on the gyroscope and accelerometer unit. The cloud infrastructure in this robot provides application software interfaces for programming the robot in the MATLAB environment and for controlling the robot using an Android tablet. Figure 7 demonstrates AnyWalker operating under the synthesized cloud computing architecture in the process of climbing an obstacle. The modeling of the heat load distribution was performed to assess the characteristics of the computing complex under the synthesized architecture. The initial data for the modeling were as follows:
• The medium temperature of 50 °C;
• The peak processor power of 65 W;
• The peak power on the converter of 40 W;
• The peak power on the input filter of 6 W.
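A first-order sanity check on these initial data can be sketched as a steady-state temperature estimate per component. The peak powers and the 50 °C ambient are from the list above; the thermal resistances are hypothetical placeholders, since the actual assessment relies on the scattering simulation shown in Figures 8 and 9.

```python
# Sketch: first-order steady-state temperature estimates, T = T_amb + P * R_th.
# Powers and ambient are from the text; the R_th values are assumed.

AMBIENT_C = 50.0

components = {
    # name: (peak_power_w, thermal_resistance_c_per_w)  -- R_th assumed
    "processor": (65.0, 0.5),
    "converter": (40.0, 0.8),
    "input_filter": (6.0, 2.0),
}

for name, (power_w, r_th) in components.items():
    temp_c = AMBIENT_C + power_w * r_th
    print(f"{name}: {temp_c:.1f} C")
```

Such a lumped estimate only bounds the problem; the spatial distribution of the roughly 111 W total peak dissipation is what the simulation in the figures resolves.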
The results of the scattering simulation are shown in the Figures 8 and 9.
The simulation results show the high efficiency of the developed architecture in the distribution of thermal loads.
The results obtained were used to design the architectures of cloud distributed computing systems of two other walking robots (Figures 10 and 11).
Thus, a design methodology for constructing a three-level control architecture for walking robots with a reactive, executive and application level has been developed.

Discussion
The use of the proposed design methodology allows us to scale, change the composition of equipment and the technologies used at each of the three levels of the cloud architecture.
The proposed methodology is based on a set of parameters derived from the analysis of technical and economic requirements and operating conditions of various mobile robots. The set determines the characteristics of the information and network infrastructure of the cloud computing complex: weight and dimensions, meeting the requirements of industrial standards for integration into existing technological processes; the use of specialized application software packages (ROS, MATLAB, etc.); the number of connected sensors; the number of drives and non-motorized degrees of freedom; the need for rapid prototyping; fault tolerance; the need for computing power; noise immunity; the distance between devices; and the requirement for the speed of information exchange in a distributed computing complex.
A three-level architecture of the AnyWalker walking robot computing complex was developed, a prototype was created and experimental research and experimental operation of the computing complex were carried out. Architectures have been developed and implemented in computing complexes for walking robots Quadruped and RGSS. The results of the implementation and conducted experimental studies of the implementations of computing complexes have demonstrated the effectiveness of the developed methodology.
As a possible limitation of the proposed approach, we should note the need to predetermine all nine parameters before the architecture development, as a later change in the operating requirements could affect the very basis of the architecture and necessitate its complete re-implementation. For example, limited computing power without the possibility of connecting an external computer (as the value of the sixth parameter) could become an obstacle if the amount of data were to grow rapidly in the future.

Conclusions
A review of methods for designing computing services and platforms for robots has been carried out, showing a growing number of cases in which cloud computing resources are applied to distribute the tasks of controlling mobile robotic complexes.
The design methodologies and architectures of computing complexes of mobile robotic systems are analyzed, which made it possible to identify a variety of available software and hardware solutions that have their own advantages and disadvantages.
A methodology for constructing a three-level architecture of a walking robot computing complex with a reactive, executive and application level has been developed. Implementation of the developed architecture allows one to scale, change the composition of equipment and the technologies applied at each level.
A set of parameters based on the analysis of technical and economic requirements and operating conditions has been formed, which determines the characteristics of the information and network infrastructure of a cloud computing complex: weight and dimensions, meeting the requirements of industrial standards for integration into existing technological processes; the use of specialized application software packages (ROS, MATLAB, etc.); the number of connected sensors; the number of drives and non-motorized degrees of freedom; the need for rapid prototyping; fault tolerance; the need for computing power; noise immunity; the distance between devices; and the requirement for the speed of information exchange in a distributed computing complex.
A three-level architecture of the AnyWalker walking robot computing complex has been developed, a prototype has been created, and experimental research and pilot operation of the computing complex have been carried out. Architectures have been developed and implemented in computing complexes for walking robots Quadruped and RGSS. The results of the implementation and experimental studies of robotic control systems have demonstrated the effectiveness of the developed methodology.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.