UAV and IoT-Based Systems for the Monitoring of Industrial Facilities Using Digital Twins: Methodology, Reliability Models, and Application

This paper suggests a methodology (conception and principles) for building two-mode monitoring systems (SMs) for industrial facilities and their adjacent territories based on the application of unmanned aerial vehicle (UAV), Internet of Things (IoT), and digital twin (DT) technologies, and a set of SM reliability models considering the parameters of the channels and components. The concept of building a reliable and resilient SM is proposed. For this purpose, the von Neumann paradigm for the synthesis of reliable systems from unreliable components is developed. For complex SMs of industrial facilities, the concept covers the application of various types of redundancy (structural, version, time, and space) for basic components—sensors, means of communication, processing, and presentation—in the form of DTs for decision support systems. The research results include: the methodology for the building and general structures of UAV-, IoT-, and DT-based SMs in industrial facilities as multi-level systems; reliability models for SMs considering the applied technologies and operation modes (normal and emergency); and industrial cases of SMs for manufacture and nuclear power plants. The results obtained are the basis for further development of the theory and for practical applications of SMs in industrial facilities within the framework of the implementation and improvement of Industry 4.0 principles.


Motivation
Systems that control industrial facilities and monitor their condition have emerged as a separate class of complex and ultra-complex systems. Their importance is growing, above all, in the context of safety: information must be collected, processed, and turned into decisions efficiently in order to minimize the risk of costly equipment failures, to prevent emergency and pre-emergency situations, and to reduce the consequences of accidents for enterprises, the environment, and the residents of the surrounding territories.
The relevance of improving the monitoring systems of industrial facilities has grown in recent decades for several reasons: use of Internet of Drones (IoD) technologies in UAV fleets. Paper [52] presents the results of using heterogeneous UAV fleets for monitoring in the field of nuclear energy and analyzes the advantages of heterogeneous fleets over homogeneous ones.
In [53,54], the issues of creating UAV-based monitoring systems are considered. However, these papers discuss only carrier selection and the payload list; the general composition of the fleet (number of vehicles, organization of management, interaction, etc.) is not determined.
Papers [55][56][57][58][59] study the management processes of monitoring systems, which are based on UAVs in various aspects, namely: self-organization in wireless networks [55], flight safety [55,56], the control and coordination of various types of robot [54], and issues of human interaction with UAVs [58,59]. Features of UAV and ground robot planning in monitoring systems and other applications are studied in [60][61][62]. However, these papers pay more attention to the aspect of flight safety, and the construction and maintenance of a given group in the fleet. Planning in UAVs, considering the indicators of their reliability and effectiveness in missions, is presented in [63][64][65]. Security issues regarding the utilization of UAV-based networks in IoT scenarios are considered in [66,67].
The group application of UAVs requires special approaches to the construction of monitoring systems as well as to their management and organization. High efficiency in the creation of such systems is provided by the introduction of multi-agent systems with features [68][69][70].
Although the design of systems with UAVs is described in many of the works above, their results are separate fragments and cannot be used as a whole for developing intelligent systems for monitoring potentially dangerous objects. This makes it necessary to conduct research related to:

• The development of monitoring system structures considering the different components of industrial facilities and the environment, and the application of technologies such as IoT, DTs, UAVs, etc. [5,71,72].
• Improving the reliability models of monitoring systems and researching their dependence on the reliability of subsystems, particularly those based on UAV fleets [73,74]; means of measuring, transmitting, and processing the information; as well as their integration using the DT.
• Monitoring system dependability studies, considering the possible degradation of systems due to channel failures and the corresponding reduction in the "monitoring coverage" of industrial facilities and their surrounding areas. In this case, it is advisable to use models of multi-state systems (MSSs) [75].

Objectives
Summing up the analysis of related works above, it should be noted that there are challenges of a theoretical and applied nature in creating a single methodology for developing architectures for monitoring systems for complex industrial facilities using UAVs, as well as models and methods for assessing the reliability of such monitoring systems. Hence, the goal of this paper is to develop the methodology for designing (building) two-mode monitoring systems for industrial objects and their adjacent territories based on the application of modern mobile and digital technologies, as well as a set of proper reliability models.
The objectives of the paper are as follows:
• To develop a methodology for building and creating general models (structures) of UAV-, IoT-, and DT-based monitoring systems in complex industrial facilities as multi-level and multi-state systems;
• To develop and explore reliability models of the monitoring systems considering applied mobile and digital technologies, operation modes (normal and emergency), etc.;
• To propose and discuss industrial cases of monitoring systems for manufacturing and nuclear power plants.

Approach and Structure
The approach to research includes the following provisions. The conception and principles of the building and general structure of systems for monitoring industrial facilities are proposed; they are:

• Parts of critical infrastructures such as nuclear power plant utilities, dangerous manufacturers, oil and gas transport communications, etc.
• Described using a multilevel hierarchy scheme, and based on the application of the following technologies: (a) digital twins as models of controlled sub-objects; (b) a UAV fleet as an additional channel for collecting information; (c) a private cloud system as a redundant emergency center for decision-making support.
The monitoring system is analyzed as a complex system in terms of dependability considering:
• Various monitoring system structures, options for sub-object monitoring, and the configuration of different stationary and mobile centers for collecting information and decision making;
• Modes of monitoring system operation (normal and emergency modes) that vary in terms of the environment and failure rates of the components;
• The placement and reliability of digital twins generated by different centers and their influence on decision making;
• The placement and reliability of decision-making units;
• The processes of monitoring system degradation caused by component and channel failures. In this case, the monitoring system is addressed as an MSS.
Thus, the overall contribution of this research covers the development of the concept and principles for dependable SM building by using mobile technologies (UAVs and UAV fleets), the Industrial IoT, edge computing, and digital twins. The suggested principles allow us to implement options for the structural organization of SMs for complex industrial facilities, described using a three-tier hierarchy (equipment-utility-zone/adjacent area).
The rest of the paper is structured in the following way. The proposed concepts and principles are described in Section 2. In Section 3, the SM is analyzed and developed as a dependable system. Section 4 contains the two industrial cases: pre- and post-accident nuclear power plant monitoring systems, and a subsystem of monitoring equipment using IoT and private cloud. In Section 5, the results of the research are discussed. Section 6 highlights the conclusion and directions for future work.

Concept of SM Building
For critical industrial facilities and monitoring systems, as their key component, there is a contradiction in the context of the development of mobile, information, and smart technologies. On the one hand, there are strict requirements for the reliability, safety, and survival of monitoring systems in industrial facilities in the pre- and post-accident period regarding the failure of sensors, communications, processing equipment, and control points. At the same time, the capabilities of UAVs and the Internet of Drones for measuring, transmitting, and processing information are growing. On the other hand, concepts and methods for creating and using reliable and resilient monitoring systems in industrial facilities under conditions of failure and accidents have not been sufficiently developed.
To resolve this contradiction, the concept of building reliable and resilient monitoring systems is proposed, based on the development of the von Neumann paradigm of the synthesis of reliable systems from unreliable components, further based on the "system components-types of redundancy" matrix [71][72][73]. For complex monitoring systems in industrial facilities, the concept covers the application of various types of redundancy (structural, version, time, and space) for basic components: sensors and means of converting measurement results, and means of communication, information transmission, and information processing. A UAV fleet is a key component of a redundant, multi-version, and dynamically reconfigurable SM because the application of UAVs allows for the implementation of specific kinds of structural, informational, and version redundancy that can be replenished. This increases the survivability of the SM in extreme conditions. Moreover, an additional kind of information and version redundancy is the presentation of the state of the monitored object in the form of digital twins for decision support centers.
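The idea of synthesizing a reliable system from unreliable components can be illustrated with a minimal sketch. It assumes statistically independent components; the function names and the numerical reliabilities are hypothetical and are not taken from the paper.

```python
from math import prod

def series(reliabilities):
    """Reliability of a chain in which every component must work."""
    return prod(reliabilities)

def parallel(reliabilities):
    """Reliability of a redundant group in which at least one member must work."""
    return 1.0 - prod(1.0 - r for r in reliabilities)

# Hypothetical values, assumed for illustration only:
# one monitoring channel = sensor -> communication -> processing.
single_channel = series([0.95, 0.98, 0.99])

# Structural redundancy: two identical, independent channels.
duplicated = parallel([single_channel, single_channel])

print(f"single channel: {single_channel:.4f}, duplicated: {duplicated:.4f}")
```

Under these assumptions the duplicated structure is always at least as reliable as a single channel, which is the effect that the "system components-types of redundancy" matrix exploits.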
The concept is also complemented by the fact that to ensure the resilience of monitoring systems, the use of diversity and a spatial distribution of decision-making centers with clearly defined functions and priorities are proposed. In addition, the diversity of control methods is due to the use of DTs and cloud data processing, as well as data transfer methods using floating UAVs that form dynamically reconfigurable structures that are resistant to physical and cyber intrusion and can be recovered. The presence of mobile and protected cloud subsystems offers the possibility of dynamic and proactive reconfiguration of assets in the event of failures, physical impacts, and cyberattacks. During the development of the structure and models of the monitoring system, a three-tier model of the presentation of the industrial facility was adopted, which includes:

• Equipment with sensors and critical-process management devices (equipment that is monitored and controlled, EC). Various systems of such monitoring are used in industrial facilities. They are based on wireless and IoT technologies and do not interfere with technological processes or create additional risks to the safety of objects.
• Several systems and building complexes, within which equipment with sensors and devices for the management of critical processes is placed (the utility that is monitored and controlled, UC). For example, for a nuclear power plant, this level covers systems and equipment whose condition is monitored by the Post-Accident Monitoring System [76].
• The territory of the industrial facility, which is limited by the outer perimeter, where monitoring stations (MSs) are located (zones that are monitored, ZC). For a nuclear power plant, this corresponds to the area whose condition is monitored by the Automated Radiation Monitoring System [77].
Next, we consider the option of building an industrial facility, on the territory of which there are several UC_i (i = 1, ..., n) and MSs of the same type.

Applied Technologies
In the process of building and operating the monitoring system, the following technologies are expected to be used:

• DTs, which ensure the creation of digital clones in an industrial facility. This makes it possible to model and determine its state from various data on the state of its component systems and objects. The creation and use of DTs of different complexity for different crisis centers are envisaged. The model uses a digital twin instance which describes a specific object with which the twin remains associated through the life of the object. Duplicates of this type usually contain an annotated 3D model, which takes into account the measurement results received from the sensors, as well as the current and predicted values of the monitoring parameters.
• IoT and Internet of Flying Things to reserve wired data transmission channels from sensors and critical-process control devices, commonly equipped with previous-generation Post-Accident Monitoring Systems and Automated Radiation Monitoring Systems. Post-Accident Monitoring Systems were created in the first decade after the Fukushima accident [76,77].
• Edge computing technologies for data pre-processing at every level of the industrial facility monitoring model using edge nodes (EN), which make it possible to reduce data volumes, their transmission speed, and the requirements for the performance of transmission equipment. Data pre-processing also increases the efficiency of decision-making processing in crisis centers. The model of the monitoring system aims to

Redundant Centers of Control and Monitoring
To ensure the dependability and resilience of monitoring the condition of the object, crisis centers using the UAV fleet (IoD) and cloud services are utilized. The backup model provides additional centers to the usual (for example, for a nuclear power plant) control room (CR) and emergency control room (ECR):

• The private cloud crisis group (PCG), which processes monitoring data using cloud services and the Internet of Things;
• The UAV fleet and flying control center (FCC), which receive and process data on the condition of an industrial facility using equipment that can be placed on one powerful UAV or distributed across a group of UAVs (UAV fleet or IoD).
The IoD is a kind of Internet of Flying Things based on a set of interacting drones. For the considered SMs, the IoD is a part of the flying control center (FCC) infrastructure with access to both the Private Cloud and Internet resources. Drone on-board systems collaborate with each other, ground sensors (located at the MSs), and the FCC.
Monitoring systems operate in normal and emergency modes according to the condition of the industrial facility. Additional control and monitoring centers of industrial facilities, namely PCG and FCC, are involved in the emergency mode. Under certain circumstances, they can perform certain monitoring functions in normal mode, but within the framework of this study, they only work in emergency mode.
In this case, the organization and functions of the SM based on CR, ECR, PCG, and FCC are described in Tables 1 and 2. CR, ECR, PCG, and FCC are separate subsystems of monitoring consisting of sensors (Sen), communication (Com), data processing (Prc), and decision-making support means (DMU). Table 1 describes the functions of these subsystems considering the levels of industrial facility hierarchy (EC, UC, and ZC). Table 2 presents the capacity of monitoring functions in the normal and emergency modes.
The CR subsystem in both modes performs all monitoring functions at all levels of the hierarchy. The ECR subsystem differs in that it does not perform equipment monitoring (EC). The FCC and PCG subsystems provide monitoring at the ZC level only in emergency mode, in full or in part, depending on the sensor coverage of the surrounding areas. The ZC sensors for CR, ECR, PCG, and FCC can be general or separate.
This GSM model describes the structural components of the SM, the relationships between which are shown in Figure 1. Reliability models of the SMs are described and explored in the next sections.
Figure 1. The structures of the industrial facility monitoring system: the T1-type structure which, in addition to the DTs for CR and ECR, requires DTs (depicted as dashed rectangles) for the PCG and FCC, and the T2-type structure which does not require them. The dot-dash arrows are utilized to show the channels for information exchange between the CR and other subsystems.

Structures of SMs
The structures of industrial facility monitoring systems with different concepts of the use of digital twins are shown in Figures 1 and 2 (types T1 and T2). The first structure (Figure 1) describes the monitoring system using DTs, designated as Dtw, in each crisis center: Dtw_CR, Dtw_ECR, Dtw_PCG, and Dtw_FCC. The second structure (Figure 2) of the monitoring system involves the use of DTs only in CR and ECR (Dtw_CR and Dtw_ECR, respectively).
In the proposed structures, CR receives data on the monitoring of the condition of all the components of the three-level model of an industrial facility: EC, UC, and ZC. The transmission of monitoring information from equipment EC_ij, i = 1, ..., n, j = 1, ..., m_i, and stations MS1, ..., MSk is performed either by nodes EN_i and FEN, respectively, which perform data pre-processing, or directly from several UC_i sensors.
ECR receives monitoring data from UC1, ..., UCn, and ZC. PCG and FCC only process ZC monitoring data. All the centers have appropriate Prc tools for data processing. The decision on the condition of the facility is made only in CR and ECR; for this purpose, decision-making units (DMU) are provided in their composition: DMU_CR and DMU_ECR. All the centers interact with each other (interaction channels are indicated by a dotted line).
Table 3 provides options for the structures of facility monitoring systems in normal and emergency mode, taking into account data sources, relevant centers, and the type of monitoring system (T1 or T2).
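The routing of monitoring data described above can be summarized in a small lookup sketch. The dictionary layout and the helper name `centers_for` are illustrative assumptions; the content of the mapping follows the text (CR receives EC, UC, and ZC data; ECR receives UC and ZC data; PCG and FCC process ZC data and, within this study, work only in emergency mode).

```python
# Which facility levels each center receives monitoring data from (per the text).
SOURCES = {
    "CR": {"EC", "UC", "ZC"},
    "ECR": {"UC", "ZC"},
    "PCG": {"ZC"},
    "FCC": {"ZC"},
}

# Centers that, within this study, are involved only in emergency mode.
EMERGENCY_ONLY = {"PCG", "FCC"}

# Centers that contain a decision-making unit (DMU_CR and DMU_ECR).
DECISION_CENTERS = {"CR", "ECR"}

def centers_for(level, emergency):
    """Return the sorted list of centers that process data from the given level."""
    return sorted(
        center
        for center, levels in SOURCES.items()
        if level in levels and (emergency or center not in EMERGENCY_ONLY)
    )

print(centers_for("ZC", emergency=False))
print(centers_for("ZC", emergency=True))
```

Such a table makes it easy to see that only the ZC level gains additional channels in emergency mode, which is why the coverage and multi-state analyses focus on ZC.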

Further details and research on SM structures will be given in Section 3 based on the analysis of various sensor coverage options of the ZC parameter space for PCG and FCC subsystems, and on the development of appropriate reliability models. The presence of two, three, or four monitoring subsystems, i.e., the involvement of PCG and FCC subsystems, refers to the completeness of ZC option field coverage, which may be limited for them (marked with an asterisk). The relevant coverage models are analyzed below.

Models of SM Option Field-Monitoring Coverage
Let us assume that each SM subsystem channel comprises either communications, a processor, and a digital twin (for the type-T1 SM structure) or communications and a processor (for the type-T2 SM structure). DTs' compatibility with different subsystems is ensured as follows:

• For the EC, the DTs are formed only in CR, so there is no compatibility problem here.
• For UC, DTs are formed in CR and ECR from common sensors, so the data for generating DTs are identical, and, therefore, their compatibility is ensured. In addition, these data are corrected by the DMU in cases of failure.
• For ZC, DTs are formed by all the subsystems (CR, ECR, PCG, and FCC) from sensors located in the ZC zone. If all subsystems use the full amount of monitoring data, both the compatibility and reliability of DT formation are ensured as in the previous case, using the DMU. If PCG and FCC subsystems use a limited data set from sensors, these data are employed to form the corresponding part of DTs and combine it with data from CR and ECR to ensure the DT's compatibility.
In emergency mode, the SM can utilize either one (PCG/FCC) or two (PCG and FCC) channels in addition to the CR and ECR channels.
Depending on the tasks performed by the SM, the following features of SM channels' functioning should be considered:
• Channels can perform the same set of monitoring functions;
• The set of monitoring functions for some channels can be considered as a subset of monitoring functions for other channels;
• Sets of monitoring functions for individual channels may overlap.
These features of the SM channels mentioned above can form various variants of allocation for monitoring functions (VAMFs) among the SM channels. Each VAMF allows us to obtain the corresponding model of SM option field-monitoring coverage, as well as forming reliability block diagrams (RBDs). The latter are needed to assess the reliability of SM, which uses such VAMFs.
Let us introduce the following notations: F_CR, F_ECR, F_PCG, and F_FCC are sets of monitoring functions performed by the CR, ECR, PCG, and FCC channels, respectively.
Considering the presented features of SM channel functioning and the accepted notations, five models of SM option field-monitoring coverage were developed (Figure 2).
As we can see from Figure 2, the largest and lowest numbers of RBDs correspond to VAMF5 (four RBDs, Figure 2e) and VAMF1 (one RBD, Figure 2a), respectively.
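The three features of channel functioning (identical sets, subset, overlap) correspond to elementary set relations between the function sets F_CR, F_ECR, F_PCG, and F_FCC. The sketch below classifies the relation between two such sets; the function labels `f1` to `f4` and the helper name `relation` are hypothetical and do not reproduce any particular VAMF from Figure 2.

```python
def relation(f_a, f_b):
    """Classify how the monitoring-function set f_a relates to f_b."""
    if f_a == f_b:
        return "same"
    if f_a < f_b:          # proper subset
        return "subset"
    if f_a > f_b:          # proper superset
        return "superset"
    if f_a & f_b:          # non-empty intersection, neither contains the other
        return "overlap"
    return "disjoint"

# Hypothetical function sets, assumed for illustration only.
F_CR = {"f1", "f2", "f3", "f4"}
F_ECR = {"f1", "f2", "f3", "f4"}
F_PCG = {"f1", "f2"}
F_FCC = {"f2", "f3"}

print(relation(F_CR, F_ECR))   # same
print(relation(F_PCG, F_CR))   # subset
print(relation(F_PCG, F_FCC))  # overlap
```

Each combination of such relations among the channels leads to a different set of monitoring functions that survive a given channel failure, and hence to a different reliability block diagram.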

Extended Specification for SM Structures
Considering the specification of the SM structures presented in Table 3 and the features of the models of SM option field-monitoring coverage for VAMF, an extended specification was developed (Table 4).
This specification, in addition to the source of information (EC, UC, or ZC), mode of functioning (N or E), and type of structure (T1 or T2), which are presented in Table 3, provides information on:
− The third channel for the three-channel SM structure (PCG or FCC);
− The designation of the SM structure;
− The reliability block diagram for the SM;
− The notations of the SM structure and SM reliability function;
− The equation for calculating the SM reliability function.

Notations and Assumptions
The used notations are as follows:
Comα_β is the communications between α and β, where α = EC, UC, ZC and β = CR, ECR, PCG, FCC.
t is the operating time.
λ is the basic failure rate corresponding to the failure rate of ComFCC.
k_i is a coefficient by which the failure rates of Comα_CR and Comψ_ECR must be multiplied to obtain their failure rates for the emergency mode, where α = EC, UC, ZC and ψ = EC, UC, ZC.
k_E is the coefficient by which the failure rates of ComEC_CR, ComUC_CR, ComZC_CR, ComUC_ECR, and ComZC_ECR must be multiplied to obtain their failure rates for the emergency mode.
2k_E is the coefficient by which the failure rate of ComZC_ω must be multiplied to obtain its failure rate for the emergency mode, where ω = PCG, FCC.
The used assumptions are as follows:
• Components of the SM have an exponential time to failure;
• During the operating time, the SM is considered an unrecoverable system.
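Under the two assumptions above, the reliability of a single component with a constant failure rate is P(t) = exp(-rate · t), and the emergency mode is modeled by scaling the communication failure rates. The following sketch shows this for one link; the function names, the rate of 0.001 per hour, the mission time, and the value of k_E are hypothetical inputs chosen only for illustration.

```python
from math import exp

def reliability(rate, t):
    """P(t) = exp(-rate * t) for an exponentially distributed time to failure."""
    return exp(-rate * t)

def emergency_rate(normal_rate, k_e, to_backup_center=False):
    """Scale a communication failure rate for the emergency mode.

    The factor is k_e for links to CR/ECR and 2 * k_e for links from ZC
    to PCG/FCC, following the notation list above.
    """
    factor = 2.0 * k_e if to_backup_center else k_e
    return normal_rate * factor

# Hypothetical inputs (assumptions, not the paper's initial data).
base_rate = 0.001   # failures per hour
t = 12.0            # operating time, hours
k_e = 9.0           # emergency coefficient

p_normal = reliability(base_rate, t)
p_emergency_cr = reliability(emergency_rate(base_rate, k_e), t)
p_emergency_fcc = reliability(emergency_rate(base_rate, k_e, to_backup_center=True), t)
print(p_normal, p_emergency_cr, p_emergency_fcc)
```

Because the system is treated as unrecoverable during the operating time, no repair terms appear; the reliability of a whole RBD is then obtained by combining such component values in series and in parallel.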

Description and Simulation of the Reliability Models
Block diagrams and equations for SM reliability assessment (see Table 4) are presented below.

Figure 4. RBD that is utilized to calculate the reliability functions given by Equations (3) and (4).
Figure 5. RBD that is utilized to calculate the reliability functions of systems $S^{UC\_N}_{2\_2}$ and $S^{UC\_E}_{2\_2}$ using Equations (5) and (6), respectively.
Figure 6. RBD that is utilized to calculate the reliability functions of systems $S^{ZC\_N}_{2\_1}$ and $S^{ZC\_E}_{2\_1}$ using Equations (7) and (8), respectively.
Figure 8. RBD that is utilized to calculate the reliability function of system $S^{ZC\_E}_{3(PCG)\_1}$ using Equation (11).
Figure 9. RBD that is utilized to calculate the reliability function of system $S^{ZC\_E}_{3(PCG)\_2}$ using Equation (12).
Figure 10. RBD that is utilized to calculate the reliability function of system $S^{ZC\_E}_{3(FCC)\_1}$ using Equation (13).
Figure 11. RBD that is utilized to calculate the reliability function of system $S^{ZC\_E}_{3(FCC)\_2}$ using Equation (14).
Figure 12. RBD that is utilized to calculate the reliability function of system $S^{ZC\_E}_{4\_1}$ using Equation (15).
Figure 13. RBD that is utilized to calculate the reliability function given by Equation (16).
Using Equations (1)-(16), some dependencies were obtained (Figures 14 and 15). The initial data include the values 0.001, 0.0001, and 0.2. The analysis of the dependencies obtained allowed us to draw the following conclusions.

• At t = 6 h, the increase in the emergency coefficient $k_E$ from 1 to 12 leads to a decrease in the values of the reliability functions $P^{ZC\_E}_{2\_1}$, $P^{ZC\_E}_{3(FCC)\_1}$, and $P^{ZC\_E}_{4\_1}$.
• Among the systems $S^{ZC\_E}_{2\_1}$, $S^{ZC\_E}_{3(FCC)\_1}$, and $S^{ZC\_E}_{4\_1}$, the most reliable system is $S^{ZC\_E}_{4\_1}$ and the least reliable one is $S^{ZC\_E}_{2\_1}$. For example, at $k_E$ = 9 and t = 12 h, the value of the reliability function $P^{ZC\_E}_{4\_1}$ is 1.02 times larger than the value of the reliability function $P^{ZC\_E}_{3(FCC)\_1}$ (0.94358 against 0.92764) and 1.03 times larger than the value of the reliability function $P^{ZC\_E}_{2\_1}$ (0.94358 against 0.92764) (Figure 15).

Description of SM as a Multi-State System
In the emergency mode, an SM utilizing some VAMFs and the Sen_ZC as a source of information can be in more than one operable state; in other words, it can be considered an MSS [70]. Graphical depictions of operable states of an SM utilizing VAMF2 (system $S^{ZC\_E}_{VAMF2}$), VAMF3 ($S^{ZC\_E}_{VAMF3}$), and VAMF4 ($S^{ZC\_E}_{VAMF4}$) are shown in Figures 16, 17, and 18, respectively.
In Figures 16-18, FSM is a set of monitoring functions performed by the SM.
The system $S^{ZC\_E}_{VAMF2}$ (Figure 16) has two operable states (the fully operable state L1 and the partially operable state L2), while both $S^{ZC\_E}_{VAMF3}$ (Figure 17) and $S^{ZC\_E}_{VAMF4}$ (Figure 18) have three operable states (the fully operable state L1 and the partially operable states L2 and L3). Each state can be characterized by an RBD comprising binary-state channels. The operable and non-operable states of the channels are shown in white and gray, respectively.
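How binary-state channels give rise to several operable system states can be sketched by enumerating all up/down combinations of the channels and grouping them by the set of monitoring functions that remains covered. The allocation of functions, the channel reliabilities, and the helper name `state_probabilities` below are hypothetical; the sketch assumes independent channels and does not reproduce the paper's VAMF2 to VAMF4 models.

```python
from itertools import product

def state_probabilities(channels, full_set):
    """Map each covered function set F_SM to its probability.

    channels: {name: (reliability, set_of_functions)}, channels independent.
    """
    names = list(channels)
    result = {}
    for up in product([True, False], repeat=len(names)):
        p = 1.0
        covered = set()
        for name, is_up in zip(names, up):
            r, funcs = channels[name]
            p *= r if is_up else 1.0 - r
            if is_up:
                covered |= funcs
        key = frozenset(covered & full_set)
        result[key] = result.get(key, 0.0) + p
    return result

# Hypothetical allocation: the backup channel covers only a subset of functions.
FULL = {"f1", "f2"}
channels = {
    "CR": (0.9, {"f1", "f2"}),
    "ECR": (0.9, {"f1", "f2"}),
    "PCG": (0.8, {"f1"}),
}
probs = state_probabilities(channels, FULL)
p_full = probs[frozenset(FULL)]         # fully operable
p_partial = probs[frozenset({"f1"})]    # partially operable
p_failed = probs[frozenset()]           # no monitoring functions left
print(p_full, p_partial, p_failed)
```

In this illustrative case the system has one fully operable and one partially operable state, and the three probabilities sum to one because the enumerated channel combinations are mutually exclusive and exhaustive.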


Reliability Models
Based on Figure 16, the probabilities of the system $S^{ZC\_E}_{VAMF2}$ being in the given states can be calculated using Equations (17)-(19), where:
• $P^{ZC\_E}_{VAMF2\_L1}$ is the probability of the system $S^{ZC\_E}_{VAMF2}$ being in state L1, with $P^{E}_{ZC\_CR} = P^{E}_{ComZC\_CR} P_{Prc\_CR} P_{Dtw\_CR}$ and $P^{E}_{ZC\_ECR} = P^{E}_{ComZC\_ECR} P_{Prc\_ECR} P_{Dtw\_ECR}$;
• $P^{ZC\_E}_{VAMF2\_L2}$ is the probability of the system $S^{ZC\_E}_{VAMF2}$ being in state L2, with $P^{E}_{ZC\_PCG} = P^{E}_{ComZC\_PCG} P_{Prc\_PCG} P_{Dtw\_PCG}$ and $P^{E}_{ZC\_FCC} = P^{E}_{ComZC\_FCC} P_{Prc\_FCC} P_{Dtw\_FCC}$;
• $P^{ZC\_E}_{VAMF2\_\geq L2}$ is the probability of the system $S^{ZC\_E}_{VAMF2}$ being in state L2 or above.
Based on Figure 17, the probabilities of the system $S^{ZC\_E}_{VAMF3}$ being in the given states can be calculated using Equations (20)-(24), where:
• $P^{ZC\_E}_{VAMF3\_L1}$ is the probability of the system $S^{ZC\_E}_{VAMF3}$ being in state L1;
• $P^{ZC\_E}_{VAMF3\_L2}$ is the probability of the system $S^{ZC\_E}_{VAMF3}$ being in state L2;
• $P^{ZC\_E}_{VAMF3\_L3}$ is the probability of the system $S^{ZC\_E}_{VAMF3}$ being in state L3;
• $P^{ZC\_E}_{VAMF3\_\geq L2}$ is the probability of the system $S^{ZC\_E}_{VAMF3}$ being in state L2 or above;
• $P^{ZC\_E}_{VAMF3\_\geq L3}$ is the probability of the system $S^{ZC\_E}_{VAMF3}$ being in state L3 or above.
Based on Figure 18, the probabilities of the system $S^{ZC\_E}_{VAMF4}$ being in the given states can be calculated using Equations (25)-(29), where:
• $P^{ZC\_E}_{VAMF4\_L1}$ is the probability of the system $S^{ZC\_E}_{VAMF4}$ being in state L1;
• $P^{ZC\_E}_{VAMF4\_L2}$ is the probability of the system $S^{ZC\_E}_{VAMF4}$ being in state L2;
• $P^{ZC\_E}_{VAMF4\_L3}$ is the probability of the system $S^{ZC\_E}_{VAMF4}$ being in state L3;
• $P^{ZC\_E}_{VAMF4\_\geq L2}$ is the probability of the system $S^{ZC\_E}_{VAMF4}$ being in state L2 or above.
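The channel relations quoted above (for example, the ZC-CR channel probability as the product of its communication, processing, and digital twin components) can be illustrated with a minimal Python sketch. The numeric values are hypothetical, and the cumulative "state Lk or above" probabilities are computed under the assumption that the operable states are mutually exclusive, so that they are plain sums of the per-state probabilities; this is not a reproduction of Equations (17)-(29).

```python
from math import prod


def channel_probability(*component_probs):
    """Series channel: all components (communication, processing, DT) must work."""
    return prod(component_probs)


def cumulative_state_probs(state_probs):
    """Given [P_L1, P_L2, ...] ordered from best to worst state, return
    [P_>=L1, P_>=L2, ...], assuming the states are mutually exclusive."""
    cumulative, total = [], 0.0
    for p in state_probs:
        total += p
        cumulative.append(total)
    return cumulative


# Hypothetical component probabilities for one channel (not taken from the paper).
p_zc_cr = channel_probability(0.99, 0.98, 0.97)

# Hypothetical per-state probabilities of a three-state system.
p_ge = cumulative_state_probs([0.80, 0.10, 0.05])
print(p_zc_cr, p_ge)
```

Under the same assumption, a per-state probability can be recovered from two cumulative ones by subtraction.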

Simulation and Analysis
For the simulation, the system $S^{ZC\_E}_{VAMF3}$ (Figure 17) was chosen. Using Equations (17)-(29), several dependencies were obtained (Figures 19-21); the initial data were the same as those used to obtain the dependencies presented in Section 3.1.
The analysis of the dependencies obtained allowed us to draw the following conclusions. For $k_E$ = 12 (see Figure 19):
• An increase in the operating time t from 0 to 15 h leads to a decrease in the probabilities $P^{ZC\_E}_{VAMF3\_\geq L3}$, $P^{ZC\_E}_{VAMF3\_\geq L2}$, and $P^{ZC\_E}_{VAMF3\_L1}$ by factors of 1.19 (from 1 to 0.83865), 1.20 (from 1 to 0.83419), and 1.21 (from 1 to 0.81969), respectively;
• At t = 15 h, the probability $P^{ZC\_E}_{VAMF3\_\geq L3}$ is 1.01 times larger than the probability $P^{ZC\_E}_{VAMF3\_\geq L2}$ (0.83865 against 0.83419) and 1.02 times larger than the probability $P^{ZC\_E}_{VAMF3\_L1}$ (0.83865 against 0.81969);
• At t = 11 h, the function $P^{ZC\_E}_{VAMF3\_L2}(t)$ has a maximum equal to 0.185 and, at t = 9 h, the function $P^{ZC\_E}_{VAMF3\_L3}(t)$ has a maximum equal to 0.008.
For $k_E$ = 3 (see Figures 20 and 21):
• An increase in the operating time t from 0 to 15 h leads to an increase in the probabilities $P^{ZC\_E}_{VAMF3\_L2}$ and $P^{ZC\_E}_{VAMF3\_L3}$ from 0 to 0.00945 and from 0 to 0.00333, respectively;
• At t = 15 h, the probability $P^{ZC\_E}_{VAMF3\_L2}$ is 2.84 times larger than the probability $P^{ZC\_E}_{VAMF3\_L3}$ (0.00945 against 0.00333);
• At t = 9 h, the probability $P^{ZC\_E}_{VAMF3\_L2}$ is 3.77 times larger than the probability $P^{ZC\_E}_{VAMF3\_L3}$ (0.00466 against 0.00124).
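The "times larger" comparisons in these conclusions are plain ratios of the quoted probabilities. A minimal Python sketch of that arithmetic follows; it uses the end-of-interval values at t = 15 h quoted in the text, and the helper name `times_larger` is introduced here only for illustration.

```python
def times_larger(a, b):
    """Ratio a / b, rounded to two decimals as in the text."""
    return round(a / b, 2)


# k_E = 12, t = 15 h: values reached by the three probabilities.
p_ge_l3, p_ge_l2, p_l1 = 0.83865, 0.83419, 0.81969
print(times_larger(p_ge_l3, p_ge_l2))  # 1.01
print(times_larger(p_ge_l3, p_l1))     # 1.02

# k_E = 3, t = 15 h: values reached by the L2 and L3 state probabilities.
print(times_larger(0.00945, 0.00333))  # 2.84
```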

Structure of IoD SM
Let us consider the structure of a drone fleet and an IoD-based industrial facility monitoring system utilized in emergency mode (Figure 22). This structure is a special case of the type-T2 SM structure. The difference is that the IoD SM utilizes one UC only. The most vulnerable part of the IoD SM is the ComZC_FCC (a component of the FCC channel) [71,73,74,76,77], which comprises drones for transmitting monitoring information from the ZC to the FCC. Thus, methods aimed at increasing the reliability of the ComZC_FCC require consideration.

Reliability Model of FCC Channel
To improve the reliability of the ComZC_FCC, a 'k-out-of-n' structure [73] was proposed (see the RBD in Figure 23). A ComZC_FCC with such a structure consists of n = 6 identical drones (DrnZC_FCC1, DrnZC_FCC2, ..., DrnZC_FCC6), including four (k = 4) main drones (DrnZC_FCC1, DrnZC_FCC2, ..., DrnZC_FCC4) and two (n − k = 6 − 4 = 2) standby (redundant) drones (DrnZC_FCC5 and DrnZC_FCC6). The ComZC_FCC remains in an operable state until three (n − k + 1 = 6 − 4 + 1 = 3) drones have failed. The ComZC_FCC can be considered a series system with two (n − k = 6 − 4 = 2) redundant drones, each of which can replace any one of the failed operating drones.
Assume that drones have an exponential time to failure. In this case, the reliability function for the ComZC_FCC can be written, considering [78], as Equation (30), where $\lambda_{dr}$ is the failure rate of the drone. The reliability function of the FCC channel, $P^{E}_{FCC}$, is then given by Equation (31).
Using Equation (31), dependencies were obtained (Figures 24 and 25). The analysis of the dependencies obtained allowed us to draw the following conclusions. For $\lambda_{dr}$ = 0.001 1/h (Figure 24):
• At t = 9 h, t = 12 h, and t = 15 h, an increase in the emergency coefficient $k_E$ from 1 to 9 leads to a decrease in the value of the reliability function $P^{E}_{FCC}$ from 0.99904 to 0.97100, from 0.99904 to 0.94181, and from 0.99824 to 0.90306, respectively;
• At $k_E$ = 9, the value of the reliability function $P^{E}_{FCC}$ at t = 9 h is 1.03 times larger than $P^{E}_{FCC}$ at t = 12 h (0.97100 against 0.94181) and 1.08 times larger than $P^{E}_{FCC}$ at t = 15 h (0.97100 against 0.90306).
For t = 12 h, the corresponding dependencies are shown in Figure 25.
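To make the 'k-out-of-n' idea concrete, the following Python sketch computes the reliability of a group of n identical, independent drones of which at least k must work. It uses the standard binomial formula for active redundancy with exponentially distributed times to failure, which is not necessarily the form of Equation (30); a standby-replacement model would give different values. Treating the emergency coefficient as a multiplier of the failure rate is also an assumption made here for illustration, and the sketch covers the drone group only, not the whole FCC channel.

```python
from math import comb, exp


def drone_reliability(t_hours, failure_rate, k_e=1.0):
    """Reliability of one drone with an exponential time to failure.
    Assumption: the emergency coefficient k_e scales the failure rate."""
    return exp(-k_e * failure_rate * t_hours)


def k_out_of_n_reliability(k, n, p):
    """Probability that at least k of n identical, independent units work,
    where p is the reliability of a single unit."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# n = 6 drones, k = 4 required, failure rate 0.001 1/h, t = 12 h.
for k_e in (1, 3, 9):
    p = drone_reliability(12, 0.001, k_e)
    print(k_e, k_out_of_n_reliability(4, 6, p))
```

With k = n the formula reduces to a pure series system, which gives a quick way to see how much the two redundant drones contribute.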

Principles and Structure
The equipment monitoring system of an industrial enterprise in Ukraine demonstrates the effect of applying Industry 4.0 with KOEEBOX devices [72,79]. The KOEEBOX device is located in the middle of the power line and analyzes the electricity consumption dynamics. After receiving the data from the electricity grid, the device transmits them to the appropriate edge node for processing and subsequent aggregation and viewing at the PCG or CR. Viewing is possible through a special application for the KOEEBOX device. Table 5 shows the possibility of using the KOEEBOX device as a means of monitoring the different components and levels of the SM as a whole. The digital twin technology with the IoT establishes a connection between the equipment and the CR. Additionally, it can be connected with cloud applications (PCG level).
The energy performance monitoring device is available in the proposed equipment monitoring system. This device must be located in the middle of the power line so that it passes power to the end device. Hence, it performs power difference analysis and obtains status and usage statistics. The device transmits the data, after receiving them from the facilities, to the corresponding web server for further viewing and aggregation. Viewing is possible through a special system application. Figure 26a shows a block diagram explaining the monitoring of the energy consumption using sensors (ECS) for equipment verification (EC1-ECn).
In the proposed equipment energy-efficiency monitoring system, all operations are performed in real time both on the client side and on the device. The cloud part of the equipment energy-efficiency monitoring system should consist of three web services: the storage service, the authentication service, and the service for maintaining client applications. Each service must run in its own isolated space, and the services must communicate with each other over the Internet. For the development of all services, the JavaScript/TypeScript programming languages are suggested, together with WebSocket and the HyperText Transfer Protocol (HTTP). The PostgreSQL database management system was chosen as the storage environment.
All services must be designed according to the ECMAScript 6 standard and the SOLID design principles. Additionally, an authorization mechanism based on JSON Web Tokens (JWT) [80] is required.
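As an illustration of what a JWT-based authorization check involves, the sketch below signs and verifies an HS256 token using only the Python standard library. It is a simplified sketch, not the system's implementation (which is proposed in JavaScript/TypeScript): the secret, the claim names, and the helper names are hypothetical, and claims such as expiry are not checked.

```python
import base64
import hashlib
import hmac
import json


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_jwt(claims: dict, secret: bytes) -> str:
    """Build a compact HS256 token: header.payload.signature."""
    header = {"alg": "HS256", "typ": "JWT"}
    parts = [_b64url(json.dumps(p, separators=(",", ":")).encode()) for p in (header, claims)]
    signing_input = ".".join(parts)
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + _b64url(signature)


def verify_jwt(token: str, secret: bytes):
    """Return the claims if the signature is valid, otherwise None."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        if header.get("alg") != "HS256":
            return None
        expected = hmac.new(
            secret, f"{header_b64}.{payload_b64}".encode("ascii"), hashlib.sha256
        ).digest()
        if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
            return None
        return json.loads(_b64url_decode(payload_b64))
    except ValueError:
        # Covers malformed tokens, bad base64, and invalid JSON.
        return None


token = sign_jwt({"sub": "client-a"}, b"hypothetical-secret")
print(verify_jwt(token, b"hypothetical-secret"))
```

A production service would normally rely on a maintained JWT library and also validate time-based claims.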

Processes and Algorithms of Monitoring
The storage service receives data from clients using the WebSocket protocol in real time, and also processes requests for statistical data on changes in energy efficiency over a certain period. Figure 26b shows how the storage service is integrated into the overall system for monitoring the energy efficiency of the equipment. In this scheme, the storage service acts as a "digital twin" of the connected device and reproduces the operation of the device in digital form. This component has three interfaces for integration:
• The WebSocket gateway for devices is used to constantly "talk" to the device and maintain it and, in the future, to control the device;
• The WebSocket gateway for clients is required to instantly transmit data received from a device to the end client, as well as for future device management;
• The HTTP REST API gateway for clients acts as an accessible and easy-to-use interface for generating statistics; it is primarily required by the client application service, and it may later be used for integration with other services.
There are three main algorithms in the equipment monitoring system.
The first algorithm describes the behavior of the service when the client interacts with the HTTP REST API gateway. This algorithm is classic for a client-server architecture with built-in authorization checks [81]: all private resources are protected by middleware that checks the authorization of incoming requests and returns the corresponding error in case of a mismatch.
The second algorithm describes the behavior of the service when the client interacts with the WebSocket gateway. The first three steps of the algorithm cover starting the program and reading the configuration files. After verifying that the connected socket was authorized successfully, the service stores the ID of the socket and waits for new data from the client's monitored object, named "device". Storing the socket ID is necessary to group the sockets of a single client for the further filtering of client lookups and data transfers.
The third algorithm handled by the storage service describes the behavior of the service when the device communicates with the WebSocket gateway. The first steps are the same as in the previous two algorithms, since these interfaces work in the same service. The next step, "Checking object data", checks the state of the corresponding device, which is received from the authentication service via the HTTP REST API gateway. If the check is successful, the service stores the connected socket ID and waits for new data. When the device sends data, the service validates them and, if the validation succeeds, stores the data in the PostgreSQL repository and then forwards them to the appropriate client sockets. Data from the device are ignored if the verification fails [82,83].
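The device-side flow of the third algorithm (check the device, remember its socket, validate incoming data, store them, and forward them to the client's sockets) can be sketched as an in-memory Python model. Everything here is a simplified stand-in: the `device_owner` mapping replaces the call to the authentication service, a list replaces PostgreSQL, plain lists replace WebSocket connections, and the payload fields `timestamp` and `power_w` are hypothetical.

```python
from dataclasses import dataclass, field


def is_valid_reading(payload):
    """Hypothetical validation rule: a timestamp and a non-negative power value."""
    return (
        isinstance(payload, dict)
        and isinstance(payload.get("timestamp"), (int, float))
        and isinstance(payload.get("power_w"), (int, float))
        and payload["power_w"] >= 0
    )


@dataclass
class StorageService:
    device_owner: dict                                   # device_id -> client_id (authentication stand-in)
    stored: list = field(default_factory=list)           # stand-in for the PostgreSQL storage
    device_sockets: dict = field(default_factory=dict)   # socket_id -> device_id
    client_sockets: dict = field(default_factory=dict)   # client_id -> {socket_id: received messages}

    def connect_client(self, client_id, socket_id):
        # Group sockets by client so that device data can be routed to all of them.
        self.client_sockets.setdefault(client_id, {})[socket_id] = []

    def connect_device(self, device_id, socket_id):
        # "Checking object data": accept only devices known to the authentication stand-in.
        if device_id not in self.device_owner:
            return False
        self.device_sockets[socket_id] = device_id
        return True

    def on_device_data(self, socket_id, payload):
        device_id = self.device_sockets.get(socket_id)
        if device_id is None or not is_valid_reading(payload):
            return False  # data from unknown sockets or failing validation are ignored
        self.stored.append((device_id, payload))
        owner = self.device_owner[device_id]
        for inbox in self.client_sockets.get(owner, {}).values():
            inbox.append(payload)
        return True


service = StorageService(device_owner={"dev-1": "client-a"})
service.connect_client("client-a", "c1")
service.connect_device("dev-1", "d1")
service.on_device_data("d1", {"timestamp": 1, "power_w": 120.5})
print(service.client_sockets["client-a"]["c1"])
```

Keeping the sockets grouped per client mirrors the reason given in the text for storing socket IDs: one device reading can be delivered to every open session of the same client.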

Experiment Results
Two types of system behavior were tested experimentally:
1. The display of energy characteristics in real time;
2. The display of energy characteristics for a past period.
For the first experiment, a device emulator was used, which sent energy data to the system once every second. The result of this experiment is described in [72].
For the second experiment, a device emulator was run for an hour, also sending data every second. Figure 27 shows the result of this experiment. In this figure, the chart has more points due to the larger range of data displayed. Each point corresponds to the two-minute mean value of each parameter.
This system is described by the RBD in Figure 4.
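The two-minute averaging used for the historical chart can be sketched as follows. This is a minimal Python illustration with synthetic data: the sample values are hypothetical, and only the sampling rate (one reading per second for an hour) and the two-minute window come from the text.

```python
from statistics import fmean


def window_means(samples, window_size):
    """Mean of each consecutive, non-overlapping window of samples.
    A trailing partial window, if any, is averaged as well."""
    return [
        fmean(samples[i:i + window_size])
        for i in range(0, len(samples), window_size)
    ]


# One hour of readings at 1 Hz (synthetic values repeating every minute).
samples = [100.0 + (i % 60) for i in range(3600)]

# Two minutes = 120 samples per chart point.
points = window_means(samples, 120)
print(len(points))  # 30
```

An hour of per-second data thus collapses to 30 chart points per parameter.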

Discussion
A distinctive feature of the proposed methodology is that the dependability of the systems built on it is ensured by combining the redundancy and diversity of various components and subsystems, including sensors, means of communication and data processing, the formation of digital duplicates, and decision support centers. The use of diversity reduces the risk of common cause failures, which can be caused by the influence of the external environment, the accumulation of failures, and cyberattacks.
The proposed SM structures, in comparison with existing works [53,54,63,71,73,76,84], provide higher reliability, as assessed using the reliability function (see Section 3.3). Additionally, these structures increase the survivability of the SM as an MSS (see Section 3.4). These benefits are achieved through the use of several kinds of redundancy and reconfigurability.
In addition, different structures of monitoring systems are proposed, depending on the options for forming the models of digital duplicates using decision-making centers (CR, ECR, PCG, and FCC). This helps in selecting a structure that takes into account the requirements for reliability and dependability, as well as the characteristics of industrial facilities.

The set of proposed SM reliability models (Section 3) provides possibilities for the high-level quantitative calculation of indicators, the comparison of structure options, and a choice regarding their use. Moreover, a simulation was performed for selected values of the component reliability parameters. This makes it possible to determine the appropriate areas of the application, taking into account requirements and constraints, as well as the growth of failure rates due to the action of accident factors. One of the features of the proposed models is that they take into account the deterioration of monitoring system channels (at the ZC level) due to failures, and are presented as multi-state systems. The reliability of IoT devices that provide data for the DTs is taken into account in the SM reliability models. Their failure rate and battery life generally depend on the operation modes (including the modes of their possible periodic transfer to sleep mode). These modes are described by the corresponding coefficients when calculating the failure rates. Therefore, the selection of IoT devices is carried out while taking into account the requirements for SM reliability according to the proposed analytical dependencies.
The two examples for the application of monitoring systems complement each other according to the hierarchy levels of their representation in industrial facilities. For the enterprise equipment monitoring level (the first level of the industrial facility hierarchy), the first example of an equipment monitoring system based on the analysis of energy consumption dynamics is given. It can complement the traditional control and management methods in a fairly simple way, as well as form an additional component of the digital twin. This paper represents a significant extension of [72] with deeper details regarding the hierarchical building of monitoring systems.
The second example relates to more complex monitoring systems, which are part of the Post-Accident Monitoring System and the Automated Radiation Monitoring System. They are implemented using the UAV/FCC mobile subsystem and the PCG private cloud environment. The FCC subsystem redundancy option illustrates how its reliability can be increased and how the composition of the UAV fleet can be selected.

Conclusions
Solving the research tasks makes it possible to make decisions about the construction and modernization of monitoring systems for complex industrial facilities that increase their dependability and safety.
The results obtained are the basis for further developing the theory and practical applications of monitoring systems for industrial facilities within the framework of the implementation and improvement of Industry 4.0 principles.
In our opinion, the following areas of research and development are important and promising:
• The development of ontological models describing intelligent UAV- and IoT-based monitoring systems in various situations and environmental conditions;
• The development of separate and joint digital twin models for decision-making centers and for the implementation of optimal procedures of critical object recovery;
• The development and research of SM dependability models considering the extended taxonomy of hardware and software faults, recovery procedures, and the location of the SM, as well as automated battery maintenance systems [84];
• The consideration of the cybersecurity aspects of the Internet of Drones for the FCC and PCG subsystems for the assessment and assurance of SM reliability [85].

Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.

Acknowledgments:
The authors appreciate the scientific staff of the Department of Computer Systems, Networks and Cybersecurity of the National Aerospace University "KhAI", and the Research Institute for Intelligent Computer Systems, West Ukrainian National University for providing invaluable inspiration, hard work, and creative analysis during the preparation of this paper.

Conflicts of Interest:
The authors declare no conflict of interest.