A Self-Optimizing Technique Based on Vertical Handover for Load Balancing in Heterogeneous Wireless Networks Using Big Data Analytics

Abstract: With the heterogeneity and collaboration of many wireless operators (2G/3G/4G/5G/Wi-Fi), the priority is to effectively manage shared radio resources and ensure transparent user movement, which includes mechanisms such as mobility support, handover, quality of service (QoS), security and pricing. This requires considering the transition from the current mobile network architecture to a new paradigm based on collecting and storing information in big data for further analysis and decision making. For this reason, the management of big data analytics-driven networks in a cloud environment is an urgent issue, as the growth of its volume is becoming a challenge for today’s mobile infrastructure. Thus, we have formalized the problem of access network selection to improve the quality of mobile services through the efficient use of heterogeneous wireless network resources and optimal horizontal–vertical handover procedures. We proposed a method for adaptive selection of a wireless access node in a heterogeneous environment. A structural diagram of the optimization stages for wireless heterogeneous networks was developed, making it possible to improve the efficiency of their functioning. A model for studying the processes of functioning of a heterogeneous network environment is proposed. This model uses the methodology of big data evaluation to perform data transmission monitoring, analysis of tasks generated by network users, and statistical output of vertical handover initiation in (2G/3G/4G/5G/Wi-Fi) mobile communication infrastructure. The model allows studying the issues of optimization of operators’ networks by implementing the algorithm of redistribution of its network resources and providing flexible load balancing with QoS users in mind.
The effectiveness of the proposed solutions was evaluated. Compared to homogeneous networks, the performance of the heterogeneous network increases by 16% when the method of static reservation of network resources is used, and by a further 13%, compared to the static method, when a uniform distribution of resources and a dynamic process of their reservation are used. The proposed self-optimizing technique based on vertical handover for load balancing in heterogeneous wireless networks, using big data analytics, improves the QoS for users.


Background and Problem Statement
Today, the volume of mobile traffic is growing rapidly due to the total spread of a variety of mobile devices [1]. The main volume of network traffic is mobile video on the internet, social media and popular services of the Internet of Things. Therefore, a solution is needed that will enable the operator to move to a centralized and flexible heterogeneous

• The problem of ensuring the effective functioning of a heterogeneous radio access network is formalized;
• The method of increasing the efficiency of functioning of heterogeneous mobile communication networks based on big data technology is developed;
• The realization of technologies for processing big data volumes, obtained by simulating the process of functioning of a heterogeneous network, is carried out;
• The assessment of the effectiveness of the proposed solutions in relation to the optimization problem of the resources of a heterogeneous network of mobile communication is carried out.
The remainder of this paper is organized as follows: Section 2 presents the related works; Section 3 presents the proposed solution in the paper, including the description of the self-optimizing technique based on vertical handover for load balancing in heterogeneous wireless networks, using big data analytics; Section 4 presents the experimental results; and Section 5 presents the conclusions of the study.
Appl. Sci. 2021, 11, 4737

Related Work
The classical vertical handover mechanism in heterogeneous network selection aims to choose the optimal network for the user; however, this may lead some networks to admit too many users, overloading them and degrading the QoS experienced by customers. The network load-balancing approach presented in the paper [17] transforms the network-balancing problem into an optimization problem by constructing a network allocation matrix that meets the user's needs. The optimal allocation is then obtained using the optimization algorithm, so that balanced network utilization is reached effectively. In addition, the method weights the different networks to meet the QoS requirements of different services. The modeling results demonstrate the effectiveness of the suggested algorithm. The proposed approach is a generic algorithm that can be applied to various heterogeneous networks, such as public wireless networks.
Su et al. [18] adopted a method that compares two parameters, namely, cell boundary crossings and handover execution, to optimize the overall network performance. The handover decision on a target cell is completely dependent on the signal strength measurement. Saeed et al. [19] developed a model to optimize the handover algorithm based on fuzzy logic for heterogeneous networks.
In paper [20], the authors proposed a heterogeneous network handover scheme based on multiple attribute decision making using the Technique for Order Preference by Similarity to Ideal Solution (TOPSIS). The method ranks alternatives by their similarity to the ideal solution. Base stations are considered alternatives, and transmission metrics are treated as attributes for selecting the appropriate base station for transmission. The authors proposed two modified TOPSIS methods for the purposes of transmission control in a heterogeneous network. The first method incorporates an entropy weighting technique to weight the transmission metrics. The second proposed method uses a standard deviation weighting technique to estimate the importance of each transmission metric. The simulation results show that the proposed methods outperform existing methods by reducing the number of frequent transmissions and radio failures, in addition to improving the average throughput of users.
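To make the TOPSIS idea concrete, the following is a minimal sketch of textbook TOPSIS with entropy weighting, written in Python. It is not the implementation from [20]: the attribute set (signal quality, available bandwidth, load), the sample values, and the function names `entropy_weights` and `topsis` are assumptions chosen for illustration only.

```python
import math


def entropy_weights(matrix):
    """Entropy weighting: attributes whose values differ more across
    alternatives receive larger weights. All values must be positive."""
    m, n = len(matrix), len(matrix[0])
    divergences = []
    for j in range(n):
        column = [row[j] for row in matrix]
        total = sum(column)
        probs = [x / total for x in column]
        entropy = -sum(p * math.log(p) for p in probs if p > 0) / math.log(m)
        divergences.append(1.0 - entropy)
    norm = sum(divergences)
    if norm == 0:  # every attribute is uniform: fall back to equal weights
        return [1.0 / n] * n
    return [d / norm for d in divergences]


def topsis(matrix, weights, benefit):
    """Return a closeness score in [0, 1] per alternative (higher is better).
    benefit[j] is True when larger values of attribute j are preferred."""
    n = len(weights)
    # Vector normalization of each attribute column, then weighting.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n)] for row in matrix]
    best = [max(r[j] for r in v) if benefit[j] else min(r[j] for r in v)
            for j in range(n)]
    worst = [min(r[j] for r in v) if benefit[j] else max(r[j] for r in v)
             for j in range(n)]
    scores = []
    for r in v:
        d_best, d_worst = math.dist(r, best), math.dist(r, worst)
        total = d_best + d_worst
        scores.append(0.5 if total == 0 else d_worst / total)
    return scores


# Hypothetical candidates: [signal quality, bandwidth (Mbit/s), load]
candidates = {"BS-A": [0.9, 20.0, 0.8],
              "BS-B": [0.7, 50.0, 0.3],
              "BS-C": [0.5, 10.0, 0.5]}
rows = list(candidates.values())
weights = entropy_weights(rows)
scores = topsis(rows, weights, benefit=[True, True, False])
ranking = sorted(zip(candidates, scores), key=lambda kv: kv[1], reverse=True)
```

The candidate with the highest closeness score would be selected as the handover target; replacing `entropy_weights` with a standard-deviation-based weighting would give a variant in the spirit of the second method described above.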
In current mobile networks, performance management (PM) data are collected from network elements into a centralized system, the Network Management System (NMS), which acts as a business intelligence tool that specializes in monitoring and reporting on network performance. The performance management files include metrics and the named counters used to quantify network performance. Current NMS implementations have limitations in scalability and in support for the volume, variety and velocity of collected PM data, especially for 5G and 6G mobile network technologies. In order to overcome these limitations, the authors in [21] developed a big data framework based on an analysis of the following components: software architecture, data transmission, processing, reporting, and deployment. The authors also analyzed the PM file format on a real dataset from four different vendors and 2G, 3G, 4G, and 5G technologies. They then experimentally evaluated the appropriateness of the proposed framework, using a case study, including 5G PM files. The test results of the components and reports are presented, defining the hardware and software required to support up to one billion meters per hour. This proposal can help telecom providers to adopt a big data reference system to address current and future challenges of new mobile networks.
The paper [22] explored a new fog computing system design to support device communication for QoS and Quality of Experience (QoE) enhancements. In particular, the authors focused on the potential of the fog computing orchestration system; how it can be adapted to next-generation cellular systems remains an open research task. The authors further proposed a mobility management procedure for fog networks, considering static and dynamic mobile nodes. The study found that the proposed work has the lowest power consumption, latency, and signaling cost compared to LTE/LTE-A networks.
In the paper [23], the authors identified the opportunities that 5G networks can provide and discussed the major challenges associated with implementing and achieving 5G goals. They also discussed recent advances in standardization, architectures that may be potential candidates for deployment, and energy challenges in 5G networks. Finally, the paper presented a big data perspective and the potential of machine learning for optimization and decision support in 5G networks.
In the process of analysis of scientific works in the direction of development of heterogeneous mobile communication networks, it was found that the rapid growth of the volume of customer traffic in mobile networks and changes in its nature and structure require a continuous and significant increase in the capacity of these systems. Radio interface technologies practically reach the theoretical limits of channel bandwidth, so the further way to increase network capacity is spatial compaction and improvement of radio resource allocation management methods. To achieve greater network performance, it is proposed to use heterogeneous networks with cells of different sizes. A number of technical problems to be solved in heterogeneous networks are considered, namely, network planning, fighting inter-system interference, transport network organization, network management and its self-organization, mobility management, and the like. In addition, the strategic directions of software-defined networking/network functions virtualization (SDN/NFV) and cloud technologies are considered; the first is associated with increased network efficiency and service flexibility, while the second aims to take advantage of a combination of new business opportunities [24][25][26][27][28].
From the analysis of the work, it follows that the existing methods of improving the efficiency of mobile networks face the problem of a lack of technologies for managing heterogeneous networks, which would allow creating a flexible, manageable, adaptive and cost-effective system with prediction of the load from users and a focus on user satisfaction.

A Future Generation Heterogeneous Wireless Network Based on Big Data Analytics
Today, progressive development in the field of telecommunications leads to the creation of various radio access technologies and the emergence of a significant number of user devices supporting different mobile standards, which, in the near future, will combine different technologies into a single converged network and create a global heterogeneous mobile network [29]. This network will be formed of different segments of wireless technologies in which coverage areas are superimposed. This will increase the bandwidth of the network, expand its coverage area, and provide services to the user with better quality at no cost [30]. In heterogeneous, next-generation wireless networks, a user with a universal terminal will be able to access the networks of different telecommunications operators/providers. There is an urgent scientific and technical task for finding new methods for managing components of cellular networks and optimizing the parameters of fragments of cellular networks of different generations (2G-5G) and the subsystem of switching services in conditions of high user activity and the resulting congestion [31].
The mechanisms of the handover decision, that is, the control of switching between communication channels, can be decentralized or centralized: the decision on the handover can be made in the user equipment (UE) itself (as in a mobile Wireless Local Area Network) or in a network entity (e.g., cellular voice transmission). These cases are called Mobile-Controlled Handoff (MCHO) and Network-Controlled Handoff (NCHO), respectively [32].
In the NCHO, the network makes the handover decision based on measurements of the UE's received signal strength (RSS) at a number of base stations. Signal quality information for all users is available at one point on the network, which facilitates the appropriate allocation of resources. Having the network make the handover decision is advantageous for the following reasons:

• The network may redirect the UE to another network that has sufficient capacity to process its current communications;
• The network can also coordinate the mobility of all UEs so that total traffic is evenly distributed across all resources, congestion is minimized, and total bandwidth is maximized.
The disadvantage of NCHO is that the radio network may lack some parameters that affect transmission decisions, such as user requirements, the exact type of service, the number of active UEs, and some operator policies related to mobility between mobile Wi-Fi and 3GPP.
Smart integration of Wi-Fi as part of the operator network provides significant benefits in terms of capacity and coverage, especially where people gather most often: public transport, shopping malls, city centers, and more. Intelligent integration involves selecting the network and authenticating with the operator who owns the Wi-Fi automatically and securely, while ensuring reliable and high-quality services. Integrated Wi-Fi networks will provide operators with more control and visibility when using Wi-Fi, as well as the ability to enforce common policies (as in 3G/4G networks).
Operators are turning to the integration of Wi-Fi as an alternative radio access technology (RAT) to add capacity and to provide value-added services. There are several important benefits for a heterogeneous architecture over end-to-end QoS.
In the MCHO, the UE can fully control the handover process. This type of handover has a short response time (about 0.1 s). The UE itself first detects all available networks. It then measures the signal levels from the surrounding base stations (BSs) and the interference levels on all channels and makes all the necessary estimates to decide on the handover. Handover is initiated if the signal strength of the serving BS is lower than that of another BS by a certain threshold.
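The threshold rule described above can be sketched in a few lines of Python. This is only an illustration of the decision step: the function name `select_target`, the use of dBm readings, and the default margin of 3 dB are assumptions and are not taken from the paper.

```python
def select_target(serving, measurements, threshold_db=3.0):
    """Return the id of the BS to hand over to, or None to stay.

    measurements maps BS id -> measured signal strength in dBm.
    A handover is triggered only if the strongest neighbour exceeds the
    serving BS by at least threshold_db (an assumed margin)."""
    serving_rss = measurements[serving]
    best_id, best_rss = serving, serving_rss
    for bs_id, rss in measurements.items():
        if bs_id != serving and rss > best_rss:
            best_id, best_rss = bs_id, rss
    if best_id != serving and best_rss - serving_rss >= threshold_db:
        return best_id
    return None


# Hypothetical measurement report collected by the UE
report = {"BS-1": -90.0, "BS-2": -85.0, "BS-3": -95.0}
target = select_target("BS-1", report)  # BS-2 is 5 dB stronger than BS-1
```

The margin prevents the UE from switching back and forth between two base stations whose signal levels are nearly equal.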
MCHO is the choice of the future as the telecommunications market migrates from a centralized operator approach to a customer-centric approach. The network-assisted handoff (NAHO) assists the UE in deciding to initiate handover by collecting and analyzing data. The UE may also provide its location and any other information that may be considered in the network analysis. The network only assists the UE in the adoption process, and the final decision will be made by the UE.
As mentioned earlier, all information collected by the UE will be sent to the data center through the mobile operator. The data center is represented as big data, consisting of n servers. Big data opens the possibility of flexible interaction of terminal devices, both with the data center and with each other. This is achieved by the built-in drivers and system software of the cloud computing system. Physical switching will be performed by the UE, based on decision making by analyzing statistics in big data [33]. This will allow the monitoring of data transmission processes, the analyzing of any tasks, and, eventually, displaying the necessary types of reports on switching or initiating a handover.
Accordingly, we proposed the architecture of the management system of the future generation heterogeneous wireless network in the development of 5G, which includes various radio access technologies with centralized management and the processing of large amounts of data. An important role in the flexibility of resource management of a heterogeneous network is played by the peculiarities of the functioning of big data and by the statistical information collected on the characteristics of users and operators of the network. Thus, it is expedient to study these features of functioning with the help of big data technology, as described in the following subsection.
The proposed architecture of a future generation heterogeneous wireless network using big data for vertical handover management is depicted in Figure 1. In a heterogeneous mobile system, data are conventionally divided into two types: user data and network operator performance data. The analysis of both types of data can provide valuable information that can be used to optimize the network. Thus, mobile operators can analyze data to perform network planning, spectrum allocation, resource management, and the like. The data collected from UEs are closely related to the user profile and behavior, including their location and mobility and personal data about the user's QoS/QoE needs [34]. With the rapid expansion of the mobile network and the development of smart mobile devices, an excessive amount of data is generated by the applications installed on users' UEs. The data collected by the operators are mainly obtained from their database registers and RANs. Database registers hold a large amount of service data regarding network performance, successful call information and service usage priorities. Cloud Radio Access Network (C-RAN) is a novel mobile network architecture that is deemed to be one of the most promising evolution trends for 5G networks [35].

Radio Access Network Selection Optimization in Heterogeneous Wireless Networks
We formalized the problem of access network selection to improve the quality of mobile services through the efficient use of heterogeneous wireless network resources and optimal load balancing vertical handover procedures.
Let a closed space Ω be given in which a wireless heterogeneous network consists of S radio access networks and a set of connections n = (1, 2, ..., N), each consisting of individual wireless stations with different characteristics. Here, S denotes the heterogeneous base stations or access points in a heterogeneous environment, and i is the number of stations. $\Omega_p$ denotes the space of coverage by the wireless stations. Each wireless station $S_i \in S$ has a number of characteristics $\chi(S_i)$, where n is the total number of characteristics.
In the space Ω, under the influence of the set of wireless communication stations (CV), there exists a set of mobile devices D functioning in roaming mode (without loading) or performing service operations, where m is the number of mobile devices and f is the position of the mobile device.
Each mobile device $D_f \in D$ has a number of characteristics $\eta_f$.
Each mobile device is in one of the following states: $S_0(D_f)$, the mobile device is not working (not enabled, faulty); $S_F(D_f)$, the mobile device is working in free mode (does not transmit data); and $S_R(D_f)$, the mobile device is working (transmits data). Such information is necessary for the system of analysis of user activity in assessing the state of the network in the process of load balancing. Let us introduce a set of information processes (BP) to be executed in the space Ω, or whose execution is to be maintained, $BP = \{BP_i : i = 1, \ldots, L\}$, for example, to provide different information services (video, call, conference, cloud services). Consider the case of a single service process. Then, a single process BP consists of work operations $R_{oi}$, $i = 1, \ldots, k$, where k is the number of work operations (the value of this variable is defined by the task setter) and $R_{oi}$ is a work operation included in the process. A work operation is an action that must be performed by a network node (server, router, switch) or a mobile device for processing information data. Certain requirements are imposed on performing each work operation:
• Conditions on the mobile device side are required, that is, a number of conditions, constraints, and criteria are imposed on the choice of devices to perform and assign work operations. We denote these requirements by $V_i$, $i = 1, \ldots, g$, where g is the total number of requirements and conditions, the value of which is determined by the task setter, and $V_i$ is the i-th requirement determined by the operator.
• A number of requirements are imposed on the quality of functioning of a work operation, denoted by $W_i$, $i = 1, \ldots, h$, where h is the number of requirements and performance criteria (support for the process of performing and the result of performing a work operation). The value of h is determined by the task setter. $W_i$ are the requirements for the execution of a work operation: responsiveness, minimum cost, and so on.
Let us formulate a general statement of the optimization problem of efficient, liquidfree communication.
At time $t$ ($t_n$), the state of the space environment Ω defines the decision-making situation, where CV is the set of wireless communication stations. Then, at time $t_n$, it is necessary to choose a wireless communication station $CV_i \in CV$ that satisfies the requirements of a function of $W$, where $W = \{W_1, W_2, \ldots, W_j\}$, $j < h$. By expression (11), this function is then turned into function (12), which is the criterion to be maximized (throughput, QoS, transmission rate relative to cost, or a certain complex criterion). Each network has a finite radio resource $P_i$. When mobile device i is allocated to network j, it uses its resource $r_{ij}$; $r_{ij}^{\min}$ is the minimum resource needed to satisfy the QoS requirements of the user, and $b_{ij}$ is a binary variable that is equal to 1 if mobile device i is allocated to network j, and is equal to 0 otherwise. If all resources of the network are busy and new requests are received, the allocation of resources between users is carried out according to a specific policy, which is represented by the function $p_{ij}(v_j, P_j, D_j)$. It can depend on the total network capacity ($v_j$), the number of mobile devices ($D_j$) and the vector of QoS requirements for all connections ($P_j$). Based on the above, we formulate a general statement of the optimum network selection problem for horizontal-vertical handover as an objective function (13). Thus, instead of solving problem (11), we solve problem (13). To solve this target function, we proposed an algorithm for optimizing a heterogeneous network using big data technology, which is shown below. The proposed scheme allows the optimization of the infrastructure to use both user and network data, which, in turn, will increase the efficiency of the network as a whole. The structure of the network required for the algorithm, in simplified form, is shown in Figure 2.
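The role of the binary allocation variable and the per-network resource limit can be illustrated with a small Python sketch. This is not objective function (13) or the authors' algorithm: it is a simple greedy heuristic under assumed inputs, in which each device is assigned to at most one network, its minimum resource demand must fit into the remaining capacity of that network, and a hypothetical per-network utility score stands in for the criterion to be maximized. The function name `assign_devices` and all sample values are illustrative.

```python
def assign_devices(capacity, min_demand, utility):
    """Greedy network selection under capacity constraints.

    capacity:   {network: available radio resource}
    min_demand: {device: {network: minimum resource needed for QoS}}
    utility:    {device: {network: score to maximize}} (assumed criterion)
    Returns {device: network}, with None for devices that cannot be served.
    """
    remaining = dict(capacity)
    assignment = {}
    for device in sorted(min_demand):
        # Networks that can still satisfy the minimum QoS demand.
        feasible = [net for net, need in min_demand[device].items()
                    if need <= remaining[net]]
        if not feasible:
            assignment[device] = None
            continue
        best = max(feasible, key=lambda net: utility[device][net])
        remaining[best] -= min_demand[device][best]
        assignment[device] = best  # corresponds to b_ij = 1 for this pair
    return assignment


# Hypothetical scenario: two networks, three identical devices
capacity = {"LTE": 10, "WiFi": 5}
min_demand = {ue: {"LTE": 4, "WiFi": 3} for ue in ("ue1", "ue2", "ue3")}
utility = {ue: {"LTE": 0.9, "WiFi": 0.6} for ue in ("ue1", "ue2", "ue3")}
result = assign_devices(capacity, min_demand, utility)
```

In this example the first two devices take the preferred network, and the third is steered to the other network once the remaining capacity is too small, which is the load-balancing effect the formulation aims at. A greedy pass does not guarantee a global optimum; an exact solution would require an integer programming solver.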
A heterogeneous network usually consists of many cells of different technologies. Such a multi-layered network architecture can provide high capacity and the necessary level of quality of service. By using big data (BD) and adapting different network resources according to dynamically changing time characteristics, it is possible to improve the throughput of the whole network. In order to improve the efficiency of the network infrastructure under the increasing load on the mobile network operators (MNOs), it is recommended to classify the user traffic requests to the necessary network resources and improve the efficiency of their distribution through the use of intelligent and analytical information based on big data. Figure 3 shows the general principle of the proposed algorithm for the optimization of a heterogeneous network, using big data.
The first stage is the collection of big data. Data collection can be achieved through user equipment (UE), radio access network (RAN) and Internet Service Providers (ISP). Events occurring in the UE are collected either through user programs or through a control signal. In a RAN with eNodeB (eNB), instantaneous measurement reports on the QoS requirements of different users are collected; a more detailed description of the collection principle is given in the paper [21]. MNOs have a huge amount of data related to media/user services in a heterogeneous network. In addition, a large storage infrastructure must have scalable capacity as well as scalable performance. Thus, latency management must be simple and efficient in order to easily store and sort big data.
The second stage is to analyze the collected data. After collecting and storing the data, another major challenge for MNOs is processing such huge amounts of data. The collected data are reusable, heterogeneous, real-time and voluminous. For this reason, data analysis and information extraction technology are needed to process the data and transform them into information for optimizing the network. This information can then be used to develop adaptive resource management schemes. Data analysis allows MNOs to systematically manage different access networks and provide services to customers. BD network optimization functions are able to analyze big data to identify problems and decide what and how to optimize at the appropriate level of the heterogeneous network. Improvement measures based on optimization results are then implemented, using control functions in the RAN. In addition, optimization at the user level can be performed. In particular, for users who are in the same cell, optimization can be configured for each user based on the class of service (priority and non-priority users). In addition, the BD network optimization function is able to predict traffic fluctuations, both in the local area and in the global coverage area, ultimately helping to improve network performance and quality of service for users.
The third stage is the management of the radio access network operator's resources. MNOs need to be informed about their long-term network deployment goals in terms of bandwidth, coverage, number and location of base stations, etc. They also require new resource allocation strategies to meet different traffic requirements across the coverage area. Thus, the use of big data analytics can be a new way to solve these problems. Network analytics involves monitoring and analyzing user statistics that enable the real-time prediction of critical points and the state of mobile networks. Based on the data obtained by MNOs, intelligent decisions are made to serve users by balancing the load and prioritizing traffic to improve the efficiency of operation and provide the necessary quality of service in a heterogeneous network.
At the fourth stage, the problem of heterogeneous network optimization is solved by applying a comprehensive method, which includes the procedure of vertical handover initiation, redistribution of flows and rejection of non-priority user sessions.
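The interplay of the three actions named in the fourth stage can be sketched as follows. This is an illustrative Python example under assumed rules, not the authors' procedure: when a network is overloaded, its sessions are considered with non-priority sessions first, each is moved to another network that has spare capacity (a vertical handover), and a non-priority session is rejected only if no such network exists. The `Session` class, the function `rebalance`, and the sample loads are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Session:
    sid: str
    network: Optional[str]  # None once the session has been rejected
    demand: float           # resource units used by the session
    priority: bool


def rebalance(sessions, capacity):
    """Relieve overloaded networks; returns (rejected session ids, load)."""
    load = {net: 0.0 for net in capacity}
    for s in sessions:
        load[s.network] += s.demand
    rejected = []
    for net in capacity:
        # Non-priority sessions are considered first (False sorts before True).
        candidates = sorted((s for s in sessions if s.network == net),
                            key=lambda s: s.priority)
        for s in candidates:
            if load[net] <= capacity[net]:
                break  # network is no longer overloaded
            target = next((t for t in capacity
                           if t != net and load[t] + s.demand <= capacity[t]),
                          None)
            if target is not None:      # vertical handover to another network
                load[net] -= s.demand
                load[target] += s.demand
                s.network = target
            elif not s.priority:        # no room anywhere: reject the session
                load[net] -= s.demand
                s.network = None
                rejected.append(s.sid)
    return rejected, load


# Hypothetical scenario: the 4G cell is overloaded (13 units of 10)
sessions = [Session("a", "4G", 6, True), Session("c", "4G", 2, False),
            Session("b", "4G", 5, False), Session("d", "Wi-Fi", 3, True)]
rejected, load = rebalance(sessions, {"4G": 10, "Wi-Fi": 5})
```

In this scenario session `c` is handed over to Wi-Fi, session `b` is rejected because it fits nowhere, and the priority sessions stay untouched.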

A Self-Optimizing Technique Based on Vertical Handover for Load Balancing in HWN Using BD Analytics
On the basis of the stages of the method for increasing the performance of the heterogeneous network, a model of the heterogeneous network environment is proposed. In contrast to known models, it uses a method of processing large amounts of data to monitor the processes of information transfer, analyze tasks, and output the necessary reports on switching or initiating the handover. It also makes it possible to investigate the optimization of the operator's network infrastructure by implementing an algorithm for the redistribution and balancing of its network resources. To simplify understanding, we proposed a hierarchical representation of the input data in the implementation of the developed complex heterogeneous network optimization process.
In Figure 4, the hierarchical structure of the input data for the modeling of the investigated network, consisting of the 2G/3G/4G/5G/Wi-Fi technology, service and QoS planes, is presented. The technology plane includes mobile technologies of the 2G, 3G, 4G, 5G Cloud-RAN, and Wi-Fi network. For modeling, it is assumed that all technologies operate within the range of the 2G cell. Additionally, only one base station that works with the presented technologies is considered in the paper. Each of the base stations receives requests from the UE, which are divided into priority requests (shown in Figure 4, marked in red) and non-priority requests (marked in blue). For example, it was assumed that users may send requests for the following types of services:
• Calls as voice transmission over IP networks (Voice over IP);
• Digital Interactive Television (IPTV);
• Internet data (I), which includes uploading and downloading data from internet resources;
• Video conferencing (Conf);
• WEB.
These types of services form the plane of services. Each base station receives requests from the UE with the type of services, respectively; requests from users can form active and inactive sessions. To connect and maintain the UE, the required quality of service must be provided, which is represented in the QoS plane.
The QoS plane uses acceptable session service quality parameters according to ITU-T recommendations to ensure guaranteed quality for users.
• Throughputs (C);
For the practical implementation of the proposed big data approaches, a cloud solution was chosen: a server rented from DigitalOcean, set up with Ubuntu 16.04 and the software needed for the experimental research. The Cassandra database and the scalable Apache Spark data analysis platform were selected for the heterogeneous network, for which processing time is critical [36]. Apache Cassandra is a scalable, fault-tolerant NoSQL database that is suitable for fast writing and reading of large amounts of unstructured data. Apache Spark is a fast, general-purpose engine for large-scale data processing. The general scheme of the heterogeneous network model using big data (BD) is depicted in Figure 5. The statistics obtained by computer simulation are then imported into the Cassandra database using the Makaroo utility (Figure 6) [37]. The generated heterogeneous network statistics are depicted in Figure 7. To calculate and analyze the data, job scripts for Spark are written in the Python programming language.
When running, Spark job-1 reads data from the Cassandra database and processes them, calculating the number of active sessions on the network. The result is presented in Figure 8a. Spark job-4 simulates an active heavy load on the network. The result is presented in Figure 8b. In order to fully run the model and quickly process the statistical data, it is necessary to ensure that the Cassandra database interacts with the Spark platform.
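The aggregation performed by job-1 can be illustrated with a minimal plain-Python sketch. It does not use Spark or Cassandra; the record fields (`technology`, `service`, `active`) and the sample rows are hypothetical stand-ins for the statistics stored in the database, and only the counting step is shown.

```python
from collections import Counter


def count_active_sessions(rows):
    """Count active sessions per (technology, service) pair."""
    counts = Counter()
    for row in rows:
        if row["active"]:
            counts[(row["technology"], row["service"])] += 1
    return counts


# Hypothetical rows standing in for records read from the database.
rows = [
    {"technology": "3G", "service": "VoIP", "active": True},
    {"technology": "3G", "service": "VoIP", "active": True},
    {"technology": "3G", "service": "IPTV", "active": False},
    {"technology": "Wi-Fi", "service": "WEB", "active": True},
]

active = count_active_sessions(rows)
```

In a Spark job the same grouping would typically be expressed as a filter followed by a group-by count on a DataFrame loaded from Cassandra.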
Next, we proceed to data analysis. Figure 9 shows the algorithm that allows optimal management of heterogeneous network resources. The algorithm begins by writing input data to the Cassandra database. The input data include service requests from UEs as well as data on the state of the heterogeneous network (active sessions for each access technology). After the statistical data are recorded, they are analyzed and compared with the maximum allowable values for each of the technologies. Using this analysis, the critical points in the network are evaluated and decisions are made on connecting the requests from UEs that arrive at a particular point in time (if there are free resources in the network).
If the heterogeneous network is loaded, a detailed analysis of the network parameters (NP) of active sessions and received requests is performed. Then, the free resources in the heterogeneous network are calculated and compared with the number of resources required to serve the incoming requests.
If the required number of resources exists, the load is redistributed and balanced in the heterogeneous network, and signaling data are sent to each UE indicating the optimal BS that can serve it. Otherwise, the priority of active sessions and incoming requests is analyzed. Non-priority sessions and requests are rejected and will be served later, while priority requests are served with the required quality of service. The algorithm is then executed again after a time ∆t. The algorithm is intended to improve the transmission success rate, so that a user can keep moving, for example while driving, without the mobile service being interrupted.
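One pass of this decision logic can be sketched as follows. This is a simplified illustration, not the authors' implementation: the `Request` type, the `balance_step` function, and the use of a single count of free sessions as the resource measure are assumptions made for the example, and the rejection of already active non-priority sessions is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    ue_id: int
    service: str
    priority: bool


def balance_step(requests, free_sessions):
    """Return (served, deferred) for one run of the algorithm.

    If the free resources cover all requests, every request is served.
    Otherwise only priority requests are served, up to the free capacity,
    and everything else is deferred until the next run after the interval.
    """
    free_sessions = max(free_sessions, 0)
    if len(requests) <= free_sessions:
        return list(requests), []
    priority = [r for r in requests if r.priority]
    non_priority = [r for r in requests if not r.priority]
    served = priority[:free_sessions]
    deferred = priority[free_sessions:] + non_priority
    return served, deferred


requests = [
    Request(1, "WEB", False),
    Request(2, "Conf", True),
    Request(3, "VoIP", True),
]
served_all, deferred_none = balance_step(requests, free_sessions=3)
served_one, deferred_rest = balance_step(requests, free_sessions=1)
```

The deferred list is what would be re-examined when the algorithm runs again after ∆t.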
We use the developed mathematical method to perform the data analysis. It relies on the following quantities, where i is the type of technology and j is the type of service:
• $P_{areq.ij}(t)$ is the number of requests from users that can be served by the BS at time t;
• $P_{max.ij}(t)$ is the maximum allowable number of active user sessions without degradation of QoS;
• $P_{cur.ij}(t)$ is the current number of active sessions at time t;
• $P_{input.ij}(t)$ is the number of incoming requests from users to the BS at time t.
The number of rejected requests from users for a particular type of service at time t, $P_{lost.ij}(t)$, is
$$P_{lost.ij}(t) = P_{req.ij}(t) - P_{areq.ij}(t).$$
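A small numeric sketch of this accounting is given below. The relation for rejected requests is the one stated in the text; the way the servable requests are obtained here (incoming requests capped by the headroom between the maximum and current number of sessions) is an assumption introduced for illustration, as are the function names and the numbers.

```python
def servable_requests(p_max, p_cur, p_req):
    """Assumed rule: serve incoming requests up to the remaining session headroom."""
    headroom = max(p_max - p_cur, 0)
    return min(p_req, headroom)


def lost_requests(p_req, p_areq):
    """Rejected requests: requested minus servable, as stated in the text."""
    return p_req - p_areq


# Illustrative values for one technology i and one service j.
p_areq_overload = servable_requests(p_max=100, p_cur=80, p_req=35)
p_lost_overload = lost_requests(35, p_areq_overload)

p_areq_light = servable_requests(p_max=100, p_cur=80, p_req=10)
p_lost_light = lost_requests(10, p_areq_light)
```

With these numbers, 20 of 35 requests fit in the overloaded case and 15 are rejected, while in the lightly loaded case nothing is rejected.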
Based on the data obtained from the network analysis, graphs of the heterogeneous network load were plotted. The analysis was performed for each technology and service separately, using Spark job scripts written in Python to calculate and analyze the user data. As an example, the results of the data analysis for the overloaded 3G technology are shown in Figure 10. For flexible resource management, it is proposed to estimate the system throughput, because as the number of users $H_{input.ij}$ increases, the available throughput $C_{free}$ decreases, which in turn increases the service delay time and the amount of lost data, as shown in Figure 11.
Figure 10. Experimental results of the developed big data analysis for the 3G mobile system: (a) the incoming VoIP load on 3G; (b) the incoming internet downloading load on 3G; (c) the incoming internet uploading load on 3G; (d) the incoming IPTV load on 3G; (e) the incoming web load on 3G; (f) the incoming videoconference load on 3G.
Figure 11. Influence of the parameter of the current system throughput utilization on latency and losses of data.
For optimal resource management, a comprehensive calculation method is proposed in this paper, which allows an even distribution of the input load in a heterogeneous network. The maximum throughput capacity of the heterogeneous network, $C_{max}$, is determined from the values $C_{max.ij}$, the maximum throughput of the i-th technology for the j-th service, where n is the number of technologies in the heterogeneous network.
Here $C_{QoS.j}$ is the throughput needed to provide the j-th service. The maximum allowable number of active user sessions for the different technologies ($P_{max.ij}$) is presented in Table 1, and the service throughput requirements ($C_{QoS.j}$) are shown in Table 2. After calculating the maximum throughput of the network, we proceed to the calculation of the throughput allocated to serve the current active sessions.
Here $C_{cur.ij}$ is the throughput of the i-th technology for the j-th service allocated to active sessions, and $C_{cur}$ is the total throughput allocated by the heterogeneous network to serve the current load. The current throughput must be calculated for each of the technologies. We then proceed to calculate the throughput to be allocated for requests coming from users, where $C_{req}$ is the throughput needed to serve incoming requests from users in the heterogeneous network. The lost throughput is
$$C_{loss} = C_{max} - (C_{req} + C_{cur}), \tag{29}$$
where $C_{loss}$ is the throughput that is lost if the number of requests exceeds the maximum permissible values.
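The throughput bookkeeping can be sketched numerically. Only Equation (29) is taken from the text. The sketch assumes that each throughput term is the number of sessions multiplied by the per-service requirement $C_{QoS.j}$ and summed over all technologies and services; this is one plausible reading of the definitions, not a formula confirmed by the text. The session counts and per-service rates below are invented for illustration and are not the values of Table 1 or Table 2.

```python
def total_throughput(sessions, c_qos):
    """Sum sessions[technology][service] * c_qos[service] over all pairs (assumed form)."""
    return sum(
        count * c_qos[service]
        for per_service in sessions.values()
        for service, count in per_service.items()
    )


# Illustrative per-service requirements in kbit/s (not the paper's Table 2).
c_qos = {"VoIP": 64, "IPTV": 2000}

# Illustrative session counts per technology and service (not the paper's Table 1).
p_max = {"3G": {"VoIP": 50, "IPTV": 10}, "4G": {"VoIP": 100, "IPTV": 40}}
p_cur = {"3G": {"VoIP": 40, "IPTV": 5}, "4G": {"VoIP": 20, "IPTV": 10}}
p_req = {"3G": {"VoIP": 5, "IPTV": 2}, "4G": {"VoIP": 10, "IPTV": 5}}

c_max = total_throughput(p_max, c_qos)
c_cur = total_throughput(p_cur, c_qos)
c_req = total_throughput(p_req, c_qos)
c_loss = c_max - (c_req + c_cur)  # Equation (29)
```

With these numbers the result of Equation (29) is positive, so the requested throughput fits within the maximum; a negative value would mean that current and requested load together exceed it.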

Experimental Results
In order to improve the efficiency of a heterogeneous network system, the paper proposes a comprehensive optimization method that includes the following:
• Dynamic reservation of the number of sessions, which makes it possible to distribute the load from users evenly (proposed method 1);
• Rejection of non-priority sessions to provide guaranteed quality of service to priority users (proposed method 2).
First, let us consider the operation of a homogeneous network when mobile technologies work independently (traditional method 1) as described in paper [19]. The block diagram of the operation of each technology, separately, is depicted in Figure 12.
Based on the analysis of homogeneous networks, histograms of the distribution of the number of active and lost requests for each available service show the state of each individual network. The total number of active and lost requests in homogeneous networks is depicted in Figure 13. Figure 12 shows that the base stations with 2G and 3G technology are overloaded, while the base stations of the other mobile technologies are idle.
Thus, the use of homogeneous networks is not optimal, because when some technologies are overloaded, requests from users are dropped, which degrades the quality of service and decreases the network performance. Therefore, in this paper it is proposed to use a heterogeneous network (traditional method 2), proposed by the authors in [17], to improve the quality of service and to reduce the number of dissatisfied customers.
With the increasing number of users and complexity of services, one communication network cannot meet all QoS requirements. Heterogeneous networks are, therefore, an effective solution. Vertical handover (VHO) is an important step in the convergence process of heterogeneous networks. An appropriate handover algorithm can improve the quality of service (QoS) of users. To solve the problem of network congestion caused by a large number of users connecting to only some of the networks in a heterogeneous environment, a load-balanced vertical handover algorithm (Figure 9) is proposed. Among the networks that can meet the user requirements, the most load-balanced network is selected using the optimal algorithm. In addition, to take into account user requirements and ensure the QoS of the network service, we use an analytic hierarchy process with big data technology to weight the different networks. This method allows the full use of different network resources, so that the load is distributed evenly between networks.
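The selection step described above, choosing the most load-balanced network among those that can meet the user requirement, can be sketched as follows. The sketch is a simplification under stated assumptions: the dictionary fields (`name`, `capacity`, `load`), the `select_network` function, and the use of a single throughput requirement are hypothetical, and the analytic hierarchy weighting of networks is omitted.

```python
def select_network(networks, required_throughput):
    """Return the name of the least-loaded feasible network, or None.

    A network is feasible if its free capacity covers the requirement.
    Among feasible networks, the one with the lowest load ratio after
    admitting the request is chosen.
    """
    feasible = [
        n for n in networks
        if n["capacity"] - n["load"] >= required_throughput
    ]
    if not feasible:
        return None
    best = min(
        feasible,
        key=lambda n: (n["load"] + required_throughput) / n["capacity"],
    )
    return best["name"]


# Illustrative capacities and loads in arbitrary throughput units.
networks = [
    {"name": "3G", "capacity": 100, "load": 90},
    {"name": "4G", "capacity": 1000, "load": 400},
    {"name": "Wi-Fi", "capacity": 300, "load": 60},
]

small_request_target = select_network(networks, 20)
large_request_target = select_network(networks, 500)
oversized_request_target = select_network(networks, 700)
```

A request that no network can carry returns None, which corresponds to the case where the priority analysis has to decide what is rejected.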
We proposed an approach for the study of the processes of functioning of the heterogeneous network environment, which, unlike the known ones, uses the technique of processing large amounts of data to perform monitoring of information transfer, analysis of tasks that are formed by network users and output of statistical data on the initiation of handover in the infrastructure of mobile communications. This approach allowed us to investigate the process of optimizing the operator's network by implementing an algorithm to redistribute its network resources and provide flexible load balancing.
The principle of operation of a heterogeneous network with static reservation of resources is presented in Figure 14.
Figure 14. Block diagram of the heterogeneous network functioning process (traditional method 2).
Figure 14 shows the operation of a heterogeneous network; in this case, the total load coming from users is distributed between the base stations according to the type of service and the resources reserved on the BS of each technology.
Let us consider the operation of a heterogeneous network for the incoming load given in the previous example for the set of homogeneous networks. The total number of active and lost requests in heterogeneous networks is depicted in Figure 15. As can be seen in Figure 16, the load from users was redistributed. The redistribution occurred in accordance with the static reservation of the number of sessions allocated to a particular type of service. Incoming requests were redistributed among the BSs in the heterogeneous network, the number of lost requests decreased significantly, and the performance of the heterogeneous network increased.
A block diagram of a heterogeneous network with load balancing is depicted in Figure 16.
Figure 16. Block diagram of a heterogeneous network functioning process with load balancing (our proposed method 1).
Figure 17 shows the resource allocation in a heterogeneous network, based on the dynamic reservation of the number of sessions for different types of services. The decision on the required number of sessions for each technology is made after estimating the resources needed to serve the incoming load at time t. This redistribution improves the performance of the heterogeneous network, utilizes all available resources, and adapts to the incoming load.
To assess the optimization of the network, the normalized value of the resources of each of the technologies is taken into account. The performance evaluation of the proposed integrated method (static and dynamic reservation) for a heterogeneous mobile network using big data is depicted in Figure 18. The KPI (Key Performance Indicator) is a quantifiable indicator of the results actually achieved by implementing our solutions. This indicator characterizes the ratio between the achieved result and the resources utilized by the network out of the maximum available to it (in our work, such a resource is the maximum network capacity of each wireless technology). Network capacity is the amount of traffic that a network can handle at any given time. This includes the maximum network throughput or the maximum number of active sessions, as depicted in Figure 8, highlighted in a red box. To compare our solutions with the known ones, we classify the KPIs as K1, K2, and K3. The coefficient K1 is a KPI that determines the total performance of homogeneous networks (traditional method 1). The coefficient K2 determines the performance of a heterogeneous network with static reservation of the number of sessions for each type of service (traditional method 2). The coefficient K3 determines the performance of a heterogeneous network with dynamic reservation of the number of sessions for each type of service (our proposed method).
Figure 18. KPI evaluation of the proposed integrated method with traditional methods for heterogeneous mobile network using big data.
Figure 18 shows the result of the comprehensive method for improving the performance of a heterogeneous network system, which allows an overall increase in network performance of 29% compared to the existing homogeneous systems.
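As a rough illustration of such a ratio-based indicator, the sketch below computes a KPI as the share of the maximum number of sessions that is actually served. The function name `kpi`, the choice of sessions as the capacity measure, and all numbers are assumptions made for the example; they are not the measurements behind Figure 18.

```python
def kpi(served_sessions, max_sessions):
    """Share of the maximum network capacity that is actually used to serve users."""
    if max_sessions <= 0:
        raise ValueError("max_sessions must be positive")
    return served_sessions / max_sessions


# Invented values: the same capacity served under three operating modes.
capacity = 200
k1 = kpi(100, capacity)  # homogeneous networks
k2 = kpi(130, capacity)  # heterogeneous network, static reservation
k3 = kpi(150, capacity)  # heterogeneous network, dynamic reservation
```

Comparing the three values on the same capacity is what makes the methods directly comparable.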
A block diagram of a heterogeneous network functioning process with load balancing and user prioritization is depicted in Figure 19.
Figure 19. Block diagram of a heterogeneous network functioning process with load balancing and user prioritization (our proposed method 2).
Figure 19 presents a block diagram of the distribution of incoming requests according to user priority. It is important for the operator to meet the needs of priority users. When the input load in a heterogeneous network exceeds the resources of this network and requests from users are rejected, it is necessary to take into account the priority of requests to reduce the level of user dissatisfaction. Accordingly, the paper proposes to analyze the priority of requests from the UE. In this case, priority requests become active sessions, and non-priority requests are discarded. If the network resources are not sufficient to serve the priority requests, the active sessions are analyzed and non-priority sessions are rejected. Figure 20 shows a diagram of the network operation with the analysis of user priority. In this case, the heterogeneous network is maximally loaded. At time t there are 100 priority requests from users that can no longer be served. In order to allocate resources for priority users, we conduct a priority analysis of the current sessions that are non-priority but take up a large amount of resources. In this case, to serve 50 requests for video conferencing and 50 requests for internet data, 15 current web sessions and 12 current IPTV sessions were rejected. Thus, at the cost of rejecting a small number of non-priority sessions, guaranteed bandwidth was provided to requests coming from priority clients, thereby reducing the number of dissatisfied users.
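The rejection of non-priority sessions can be sketched as a simple greedy selection. This is an illustrative assumption, not the procedure used in the paper: the tuple layout, the function name, the rule of rejecting the largest consumers first, and the throughput values are all hypothetical.

```python
def sessions_to_reject(active, needed):
    """Choose non-priority sessions to reject until `needed` throughput is freed.

    active: list of (session_id, throughput, is_priority) tuples.
    Returns (rejected_ids, freed_throughput). If the freed amount is
    still below `needed`, the caller has to handle the remaining shortfall.
    """
    candidates = sorted(
        (s for s in active if not s[2]),
        key=lambda s: s[1],
        reverse=True,
    )
    rejected, freed = [], 0
    for session_id, throughput, _ in candidates:
        if freed >= needed:
            break
        rejected.append(session_id)
        freed += throughput
    return rejected, freed


# Illustrative active sessions with throughput in kbit/s.
active = [
    ("web-1", 500, False),
    ("iptv-1", 2000, False),
    ("conf-1", 1500, True),
    ("iptv-2", 2000, False),
]

rejected_ids, freed = sessions_to_reject(active, needed=3000)
```

Priority sessions are never candidates for rejection, which reflects the guarantee described in the text.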

Conclusions
We analyzed the existing problems in modern communication networks. The main problems of mobile networks are the focus on the coverage area rather than on the user, and the inability of the network to respond adaptively to bursts of large amounts of data created by users. Therefore, the paper proposes to develop a heterogeneous network that is user-oriented and allows the network operator to analyze and predict user behavior through the use of cloud technologies.
A study of technologies for the effective functioning of heterogeneous mobile networks was carried out. It analyzes the current technology and identifies opportunities to improve the quality of service. On the basis of the data obtained in the study, a model for improving the performance of a heterogeneous network using a big data processing system was developed. Criteria for effective resource management were compared, and the criterion of maximum uniform loading of the mobile network was chosen.
We developed a comprehensive method for flexible resource management in a heterogeneous network, including static resource reservation for a certain type of service in each technology, dynamic resource reservation, and user priority analysis, which can reduce the number of dissatisfied customers.
We proposed the use of big data technologies for optimal resource management in mobile networks. For effective data analysis in a heterogeneous system, we classified the data into two types: user data and network operators' data. The analysis of both types of data allows us to isolate valuable information that is used for network optimization and flexible resource management.

Conclusions
We analyzed the existing problems in modern communication networks, such as the fact that the main problem of mobile networks is the focus on the coverage area, not on the user, and the inability of the network to adaptively respond to bursts of large amounts of data created by the user. Therefore, the paper proposes to develop a heterogeneous network, which will be user-oriented and allow the network operator to analyze and predict user behavior through the use of cloud technologies.
We studied technologies for the effective operation of heterogeneous mobile networks, analyzing current solutions and identifying opportunities to improve the quality of service. Based on the results of this study, we developed a model for improving the performance of a heterogeneous network using a big data processing system. We also compared criteria for effective resource management and selected the criterion of maximum uniform loading of the mobile network.
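The criterion of maximum uniform loading can be illustrated with a minimal sketch. It assumes that uniformity is measured as the population variance of per-network utilization (load divided by capacity); this measure, the function name `pick_network`, and the example figures are illustrative assumptions, not values or definitions taken from the study.

```python
from statistics import pvariance


def pick_network(loads, capacities, demand):
    """Return the network whose admission of `demand` leaves utilization
    most uniform across all networks, or None if no network has room."""
    best_name, best_spread = None, None
    for name in loads:
        if loads[name] + demand > capacities[name]:
            continue  # this network cannot carry the new session
        trial = dict(loads)
        trial[name] += demand
        utilization = [trial[n] / capacities[n] for n in trial]
        spread = pvariance(utilization)  # lower variance = more uniform load
        if best_spread is None or spread < best_spread:
            best_name, best_spread = name, spread
    return best_name


# Hypothetical capacities and current loads in arbitrary resource units.
capacities = {"LTE": 100.0, "WiFi": 50.0, "5G": 200.0}
loads = {"LTE": 80.0, "WiFi": 10.0, "5G": 60.0}
choice = pick_network(loads, capacities, demand=10.0)
print(choice)
```

With these figures the session goes to the lightly used Wi-Fi network, because that choice narrows the gap between the most and least loaded networks more than the alternatives do.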
We developed a comprehensive method for flexible resource management in a heterogeneous network. It combines static reservation of resources for a given type of service in each technology, dynamic resource reservation, and user priority analysis, and it can reduce the number of dissatisfied customers.
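The interplay of the three components named above can be sketched as a simple admission routine. This is an illustration under stated assumptions, not the method as implemented in the study: the `Cell` class, the shared pool, the priority threshold `min_shared_priority`, and all numbers are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Cell:
    reserved: dict                 # service type -> units reserved in advance
    shared: int                    # units in the dynamically assigned pool
    used: dict = field(default_factory=dict)
    shared_used: int = 0

    def admit(self, service, units, priority, min_shared_priority=2):
        """Try the service's own reservation first, then the shared pool.

        Only users whose priority reaches `min_shared_priority` may draw
        on the shared pool (an assumed policy for illustration)."""
        free_reserved = self.reserved.get(service, 0) - self.used.get(service, 0)
        if units <= free_reserved:
            self.used[service] = self.used.get(service, 0) + units
            return "reserved"
        if priority >= min_shared_priority and units <= self.shared - self.shared_used:
            self.shared_used += units
            return "shared"
        return "rejected"


cell = Cell(reserved={"voice": 2, "video": 4}, shared=3)
results = [
    cell.admit("voice", 2, priority=1),  # fits the voice reservation
    cell.admit("voice", 1, priority=1),  # reservation exhausted, low priority
    cell.admit("voice", 1, priority=3),  # high priority, served from the pool
    cell.admit("video", 5, priority=3),  # exceeds both reservation and pool
]
print(results)
```

In this sketch a rejected request corresponds to a dissatisfied customer, so the shared pool and the priority rule determine which users are protected when a reservation runs out.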
We proposed using big data technologies for optimal resource management in mobile networks. For effective data analysis in a heterogeneous system, we classified the data into two types: user data and network operator data. Analyzing both types allows us to extract valuable information for network optimization and flexible resource management.
As a practical result, we implemented a big data processing system. We used the DigitalOcean cloud computing service, where we created an account and configured a virtual server for data processing. Using operating-system-level virtualization with the Docker platform, we created two separate containers in which we deployed the non-relational database Apache Cassandra and Apache Spark, a platform for fast real-time data analysis. We developed Python scripts that analyze large amounts of data and, by filtering and ordering it, output the relevant records, which supports intelligent resource management decisions and prediction of the behavior of a heterogeneous network.
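The filtering and ordering step can be illustrated with a small pure-Python stand-in. It does not use the Spark or Cassandra APIs and is not one of the scripts from the study; the record fields (`ts`, `cell`, `load`), the validity range, and the sample values are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical monitoring samples: timestamp, cell identifier, load fraction.
records = [
    {"ts": 3, "cell": "A", "load": 0.9},
    {"ts": 1, "cell": "A", "load": 0.7},
    {"ts": 2, "cell": "B", "load": None},  # incomplete sample
    {"ts": 4, "cell": "B", "load": 0.2},
    {"ts": 5, "cell": "A", "load": 1.4},   # outside the valid range
]


def clean_and_order(rows):
    """Drop incomplete or out-of-range samples, then sort by timestamp."""
    valid = [r for r in rows if r["load"] is not None and 0.0 <= r["load"] <= 1.0]
    return sorted(valid, key=lambda r: r["ts"])


def mean_load_per_cell(rows):
    """Average the load of each cell over the cleaned samples."""
    totals = defaultdict(lambda: [0.0, 0])  # cell -> [sum of loads, count]
    for r in rows:
        totals[r["cell"]][0] += r["load"]
        totals[r["cell"]][1] += 1
    return {cell: total / count for cell, (total, count) in totals.items()}


ordered = clean_and_order(records)
summary = mean_load_per_cell(ordered)
print(ordered)
print(summary)
```

A per-cell summary of this kind is the sort of input a load balancing decision could use; in a deployment like the one described, the same filter, sort, and aggregate steps would run inside the analytics platform rather than on an in-memory list.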
We evaluated the effectiveness of the proposed solutions. The static network resource reservation method increased the performance of the heterogeneous network by 16% compared to homogeneous networks, and uniform resource allocation with a dynamic reservation process increased it by a further 13% compared to the static method. User prioritization improves the quality of user service in the heterogeneous network and reduces the number of dissatisfied customers.