Model of User Data Analysis Complex for the Management of Diverse Web Projects during Crises

Abstract: This article discusses the relevant task of analyzing user data in the process of managing various web projects. The results of this analysis will help to improve the management of diverse web projects during crises. The authors explore the concept of data heterogeneity in web projects, classify web projects by function and purpose, and analyze the search models and data display in web projects. The proposed algorithms for analyzing user data in the process of managing diverse web projects will improve the structuring and presentation of data on the web project platform. The model user data analysis complex developed by the authors will simplify the process of managing various web projects during crises.


Introduction
Web projects are mainly decentralized, unpredictable, multi-layered and difficult to manage. Web project managers constantly extend project functionality and consolidate huge amounts of disparate user data. It is for these reasons that the management of large web projects during crises is difficult. It is therefore necessary to review current research on web project analysis, paying close attention to the approaches to and complexities of identifying user data, and to the methods of diverse web project management.
Considering the significant practical and theoretical results of research in related fields, web projects should be analyzed as heterogeneous data environments [1][2][3][4][5][6] and as content sources [7][8][9][10]. As conventional messaging and news-distribution-oriented web projects are gradually transformed into video hosting with the ability to stream video online in real time [11][12][13][14], the speed of information is measured in seconds. Therefore, the development of new practices and management methods for diverse web projects during crises is an important task. A detailed analysis of all the factors and influences that affect the management of web projects during crises [15,16] will allow us to predict the direction of web service development and the key threats in the process of web project management [17][18][19][20]. There are many different solutions for data discovery and analysis in web projects. However, the task of forming methods and tools for the adaptive analysis of heterogeneous interconnected databases in web projects (in a constantly changing data structure) remains unresolved. All available web project analysis techniques [21][22][23][24][25][26] only partially solve the problem of collecting information from such environments.
Educational web projects (for example, The Student Room Group and the ePALS School Blog) are designed to quickly integrate users into the learning process; these communities have specialized functionality. Professional web projects are created to promote career growth for users. A community member's profile is presented in the form of a resume, which allows recruitment specialists to directly assess their level of qualification and fit with company vacancies, and users can use infographics to display their career history. Professional communities also include academic web communities. The developed classification was formed according to the purpose of each diverse web project. However, the automated analysis of user page data, the standardization of platform content, the ability to access user data, the data presentation architecture, and crisis factors in web project management are all equally important indicators. In different web projects of the same group of communities, these indicators may differ dramatically.
The ability to identify each user is one of the most important features of a web project. In order to be a full participant in a web project, one must register in this environment as a user. The procedure for registering a new user involves filling in the user's personal data. A personal profile with contact data of the registered person is then created, along with a unique profile identifier and functionality for communication with other users in this environment.
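As a minimal sketch of this registration step (the field names `name`, `email` and `contacts`, and the counter-based identifier, are assumptions for illustration, not the paper's schema):

```python
import itertools
from dataclasses import dataclass, field

# Monotonic counter standing in for the platform's identifier generator.
_next_id = itertools.count(1)

@dataclass
class UserProfile:
    """A registered participant's personal profile (hypothetical schema)."""
    name: str
    email: str
    profile_id: int = field(default_factory=lambda: next(_next_id))
    contacts: list = field(default_factory=list)  # links to other profiles

def register(name: str, email: str) -> UserProfile:
    """Registration: personal data in, profile with a unique identifier out."""
    return UserProfile(name=name, email=email)

alice = register("Alice", "alice@example.com")
bob = register("Bob", "bob@example.com")
assert alice.profile_id != bob.profile_id  # identifiers are unique
```

The unique `profile_id` is what makes a participant identifiable to other users and to analysis software alike.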

Methods and Techniques of Data Collection
The functionality and interface of diverse web projects are usually standardized [26], which provides minimal variability in page design for optimal perception and input of user data, such as personal data and content [15,25,27]. George Ritzer [28] has studied the notion of the "McDonaldization" of society and the internet through four main components: efficiency, calculability, predictability and control. The policy of data validation between individual web projects is also only informally agreed upon [6]. In addition, in order to avoid situations that provoke the emergence or deepening of crises, administrators monitor the authenticity of posted photos and other media content, blocking the pages of users who upload misleading content.
A feature of most web projects is the maximum correspondence of the user's profile data to their personal data and servitization [29][30][31][32][33][34]. In web projects, users are registered under their real names, upload photos and videos in which they are depicted [35,36], and virtually communicate with friends and people they know in real life.
A diverse web project has the following characteristics [9]:
• Discretion (user profiles are separate and linked).
• Similarity (web project profile characteristics are identical).
• Proximity (links between user profiles are in a single space-time cycle).
• Reciprocity (interaction between user profiles is symmetrical, which is one of the prerequisites for the exchange of resources and information).
The diverse web project presentation model (Figure 1) sets the web project data display template and selects the technology for presenting this data.

The areas where users can publish discussions are set by this model. These areas can be any of the following: user pages, group pages, a dedicated part of a page, special pages, etc. The rules of message distribution are also set by the presentation model; for example, one user's post can be placed on their page and can also be distributed on other pages through publications and messages. This post may be shared by either the author or another user who has access to it. As a result, it is possible to count each user who distributed the post and to whom it was distributed. One post by one user can effectively be on many pages (1..*). However, the post cannot have multiple authors.
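The one-author, many-pages rule can be sketched as a simple data structure; the class and attribute names here are illustrative assumptions, not part of the described model:

```python
from dataclasses import dataclass, field

@dataclass
class Post:
    """One post: exactly one author, but placements on many pages (1..*)."""
    author: str
    text: str
    pages: set = field(default_factory=set)      # pages hosting the post
    shared_by: set = field(default_factory=set)  # users who distributed it

    def share(self, user: str, page: str) -> None:
        """A user with access re-publishes the post onto another page."""
        self.shared_by.add(user)
        self.pages.add(page)

post = Post(author="alice", text="hello")
post.pages.add("alice/page")    # the author's own page
post.share("bob", "bob/page")   # redistribution by another user
post.share("carol", "group/news")

assert post.author == "alice"              # single author, always
assert len(post.pages) == 3                # one post on many pages
assert post.shared_by == {"bob", "carol"}  # each distributor is countable
```

Tracking `shared_by` and `pages` is exactly what allows the model to count who distributed a post and where it ended up.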
It is also possible to track the importance of a post by receiving the number of messages relevant to it. Based on the comment functionality, users can rate publications and posts. It is also possible to build a schedule of a user's activity by obtaining the number of comments on various posts on different pages, and other actions.
A competently thought-out visual model with a comprehensive vision of the process of the diverse web project's functions will greatly simplify the management of this diverse web project.
The visual model defines the overall look of the diverse web project platform, and it may also hide some components that are present in the functionality of the project. The rules for combining page tags, their attributes, and the general appearance of the document object model of the pages are set with the visual model. Whether the names and values of attributes will change can be viewed or set in this model.
Each user node on the platform has its own display. The visual representation of the user profile is given by Equation (1), where VisualElem belongs to the set of visual elements displayed on the platform user's page. The model of interaction between the web project server and the user (2) is organized as follows:
• The client (browser or application) accesses the web project platform through the navigation bar;
• When the user accesses the navigation bar, specific parameters for each community node are passed;
• Additional information is written in the headers and cookies of the request.
The body of the message records the basic user information needed to identify the user.
The result of the request is a response from the server to the client. The structure of responses to client inquiries is based on the navigation model (3). The response includes:
• Body, which contains basic user information;
• Headers, which contain metadata about the web project and the response;
• Cookies, which preferably contain user identification data, a session ID, and additional content specific to each page of the system (cookies are stored on the client side);
• Server info, which contains specific data of the web project platform.
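The response structure described above can be sketched as follows; the field and function names (`ServerResponse`, `handle_request`) and the concrete values are assumptions for illustration:

```python
from dataclasses import dataclass, field

@dataclass
class ServerResponse:
    """Response structure following the navigation model (names assumed)."""
    body: dict = field(default_factory=dict)         # basic user information
    headers: dict = field(default_factory=dict)      # metadata about the project
    cookies: dict = field(default_factory=dict)      # session ID, page-specific data
    server_info: dict = field(default_factory=dict)  # platform-specific data

def handle_request(user_id: str, session_id: str) -> ServerResponse:
    """Build a response to a client request coming through the navigation bar."""
    return ServerResponse(
        body={"user_id": user_id},
        headers={"X-Project": "community-platform"},
        cookies={"session": session_id},
        server_info={"node": "web-1"},
    )

resp = handle_request("u42", "s-123")
assert resp.cookies["session"] == "s-123"  # stored on the client side
assert "user_id" in resp.body
```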
After forming models for presenting the web project and its information content, the modeling of the software package for detection and analysis of user data in the web project can begin.
A web project platform consists of interconnected user pages. The visual model allows the content of the platform page to be presented as a hierarchical tree of nodes that contain all of the data. Analysis of this data opens the possibility of obtaining comprehensive information about the user of the platform.
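A minimal sketch of such a hierarchical node tree and a traversal that selects the nodes carrying useful data (the `Node` structure and `collect` helper are hypothetical, for illustration only):

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """One node of a platform page's hierarchical content tree."""
    tag: str
    text: str = ""
    children: list = field(default_factory=list)

def collect(node, predicate):
    """Depth-first walk gathering every node that satisfies the predicate."""
    found = [node] if predicate(node) else []
    for child in node.children:
        found.extend(collect(child, predicate))
    return found

page = Node("page", children=[
    Node("personal", children=[Node("field", "born: 1990")]),
    Node("posts", children=[Node("post", "hello"), Node("post", "")]),
])

# Select only nodes that actually hold text content.
useful = collect(page, lambda n: bool(n.text))
assert [n.text for n in useful] == ["born: 1990", "hello"]
```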
The process of submitting web project data, with subsequent analysis, can be divided into three stages (Figure 2):

• Obtaining information: At this stage, exactly what information is needed and how it will be received are determined. The raw input information is pre-processed.
• Filtering: Redundant information is discarded, and the input data set for the next stage is built.
• Data structuring: Data is built on the basis of structuring and classification/clustering. Nodes that carry useful information are selected.
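The three stages above can be chained as a small pipeline; the function names and the colon-delimited item format are illustrative assumptions:

```python
def obtain(raw_pages):
    """Stage 1: pre-process the raw input, keeping only non-empty entries."""
    return [p.strip() for p in raw_pages if p.strip()]

def filter_stage(items):
    """Stage 2: discard redundant entries, building the next stage's input."""
    seen, kept = set(), []
    for item in items:
        if item not in seen:
            seen.add(item)
            kept.append(item)
    return kept

def structure(items):
    """Stage 3: classify items into a simple keyed structure."""
    out = {}
    for item in items:
        out.setdefault(item.split(":")[0], []).append(item)
    return out

raw = ["  photo:a.jpg", "post:hi", "post:hi", "", "photo:b.jpg"]
result = structure(filter_stage(obtain(raw)))
assert result == {"photo": ["photo:a.jpg", "photo:b.jpg"],
                  "post": ["post:hi"]}
```

Each stage consumes the previous stage's output, mirroring the cyclic flow described for the zones below.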

The prerequisites for analyzing web communities are as follows:
• Presenting search criteria for relevant information;
• Presenting the algorithm for analyzing the user page on the web project;
• Presenting the algorithm for forming a structural tree of page searches;
• Full access to the functionality of the web project platform for the software;
• Access to the environments of the received information storage for the software.
To implement these conditions, it is necessary:
• To build the functionality for user-submitted input data that will set the search criteria for information;
• To draw on existing research on the structure of the user page in the web project;
• To use third-party libraries to analyze the main nodes of the user page on the web project;
• To build a software authorization mechanism for each web project.
The stages of analysis are in a clearly defined order, are interconnected, and constantly intersect. The intersections between these stages will be called zones. There are three such zones:

1. Zone 1 is an area where the received information is pre-filtered in order to select key features. Zone 1 is intermediate between the first and second stages; here, the information is converted into data.
2. During data filtering, the data is partially structured, and it is necessary to additionally check the input set for correctness. These processes take place in Zone 2.
3. In Zone 3, data is formatted and supplemented. Correct structuring of the data may require additional information, in which case the process returns to the information-obtaining stage. These processes run cyclically until a complete data structure is obtained.
The combination of these three stages forms the concept of data analysis in a web project.
The analysis of each page of the diverse web project is carried out in four stages:
• Stage 1. Analysis of web project page headers and metadata.
• Stage 2. Working with page areas. There are four main areas of the page, namely: personal information, multimedia content, direct links and posts. Since these areas are isolated and structurally independent units of the page content, they are analyzed in several different ways, in parallel with each other.
• Stage 3. Direct retrieval of data from the elements of each area. The list of elements and the metadata for their analysis must be stored in a database.
• Stage 4. Saving the received data to the database and system resources.
The personal information area contains regions with personal photos of the profile owner. One way to analyze a photograph is to read the attributes of that photograph in order to obtain its metadata (e.g., time and date the photograph was created, coordinates). If the recorder has a built-in GPS receiver and geotagging is enabled at the time the photo was taken, it is also possible to retrieve the coordinates of where the image was taken.
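In practice, a library such as Pillow would read the EXIF attributes from an image file; the conversion of the GPS tags into usable coordinates can be sketched in pure Python. The metadata values below are illustrative, not taken from real profile photos:

```python
def gps_to_decimal(dms, ref):
    """Convert EXIF GPS degrees/minutes/seconds to signed decimal degrees."""
    degrees, minutes, seconds = dms
    value = degrees + minutes / 60.0 + seconds / 3600.0
    # Southern and western hemispheres are encoded by reference letters.
    return -value if ref in ("S", "W") else value

# EXIF-style metadata of a geotagged photo (values are illustrative).
exif = {
    "DateTimeOriginal": "2020:09:14 10:32:07",
    "GPSLatitude": (49, 50, 24.0), "GPSLatitudeRef": "N",
    "GPSLongitude": (24, 1, 48.0), "GPSLongitudeRef": "E",
}

lat = gps_to_decimal(exif["GPSLatitude"], exif["GPSLatitudeRef"])
lon = gps_to_decimal(exif["GPSLongitude"], exif["GPSLongitudeRef"])
assert round(lat, 2) == 49.84 and round(lon, 2) == 24.03
```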
Detailed analysis of photographs can be performed using deep neural network learning algorithms. The presented software complex uses computer vision to analyze all available photos on the user's profile page. The computer vision algorithms used represent the visual image as a branched, tree-like hierarchical structure, which records the results of the learning process at the nodes of the tree and retains the probability of transition to the appropriate level of the tree. This structure is stored as an XML file, which is processed at every step of the algorithm.
The algorithm of diverse web project page analysis is shown in Figure 3. The basis of the implemented algorithm is a constructed model of available content.

The analyzer operates in four steps (Figure 3): (1) search and analysis of the personal data area; (2) search and analysis of multimedia content; (3) formation of search patterns of elements; and (4) execution of item search queries, obtaining data and metadata for elements, and forming and sending additional requests.
Special attention should be paid to the analysis of user posts on the diverse web project. The algorithm for analyzing a diverse web project user message is shown in Figure 4.
Existing algorithms can be used to obtain a page tree of the users of this diverse web project, and their post structure can be obtained in this way. The posts' addresses are submitted instead of the user addresses during the algorithm's initialization. The organization of links between posts corresponds to the organization of links between users of the diverse web project. After the data is received from the web pages of the diverse web project, it needs to be brought into the desired form by deleting unnecessary information and discarding garbage. These actions are performed at the data filtering stage, before the information is saved in the database.
The filtering stage can be divided into four components: normalization, filtering, thinning, and division of the tree into zones.
Normalization and filtering are required to remove information debris, that is, information that is of no use and is clogging up relevant data for analysis. The initial stage of document normalization is the verification for compliance and correctness of encoding.
To facilitate work with the obtained data, they need to be reduced to a standardized form (case, word structure, removal of extra spaces, removal of repetitions). Next, it is necessary to clean the data by removing service tags (<head>, <title>, <meta>, <frame>), leaving only the elements that are directly related to the content.
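A minimal sketch of this normalization step using regular expressions (the function name and the exact cleaning rules are assumptions; a production system would use a proper HTML parser):

```python
import re

SERVICE_TAGS = ("head", "title", "meta", "frame")

def normalize(html: str) -> str:
    """Reduce page text to a standardized form: strip service tags,
    lowercase, collapse whitespace, drop immediate word repetitions."""
    for tag in SERVICE_TAGS:
        # Remove the tag pair with its contents, or a lone opening tag.
        html = re.sub(rf"<{tag}[^>]*>.*?</{tag}>", " ", html, flags=re.S | re.I)
        html = re.sub(rf"<{tag}[^>]*/?>", " ", html, flags=re.I)
    text = re.sub(r"<[^>]+>", " ", html)              # drop remaining markup
    text = re.sub(r"\s+", " ", text).strip().lower()  # case and spacing
    text = re.sub(r"\b(\w+)( \1\b)+", r"\1", text)    # remove repetitions
    return text

page = "<head><title>x</title></head><p>Hello   Hello WORLD</p>"
assert normalize(page) == "hello world"
```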
The data normalization stage stabilizes the data so that errors do not arise in the following stages, and removes unnecessary information, ensuring better performance in the following steps. The filtering stage removes irrelevant information, and the thinning step reduces the amount of information to be read by deleting duplicates. Once the data has been thinned and filtered, the next step is the structuring of the data (Figure 5). How well and how quickly the page analysis software works depends heavily on this stage. The first step is the data submission process.
Data from the text format should be converted to another, more machine-readable, format. These formats can be XML, JSON, Excel documents, etc.
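For example, a structured user record can be serialized to JSON with the standard library; the record's fields are assumptions for illustration:

```python
import json

# A structured user record produced by the analysis stages (fields assumed).
record = {
    "profile_id": 42,
    "name": "alice",
    "posts": [{"text": "hello", "comments": 3}],
}

# Machine-readable JSON for downstream storage or exchange.
serialized = json.dumps(record, sort_keys=True)
restored = json.loads(serialized)
assert restored == record  # the round trip preserves the structure
```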
The developed model of the user data analysis complex for the management of diverse web projects is a multilevel web service. This complex is built on the SOLID principles of object-oriented design.
The model of the user data analysis complex is shown in Figure 6.

Figure 6. Model of the user data analysis complex for the management of diverse web projects. (The figure shows the visual layer with web, mobile and desktop clients; the database and resource levels; user and administrator roles; and the layer of background processes.)
The developed model consists of the following layers: visual layer, business layer, data layer, layer of background processes and support component.
The visual layer helps users to interact with the software-analytical complex with a client application. The data received from the user on the visual layer will be sent in the form of requests for the business layer. The visual layer consists of software solutions: web client (submitted by a website), mobile phone application and desktop application. A web client is a web application built on the principle of MVC (Model-View-Controller) design pattern. The business layer contains the main functionality of the software-analytical complex. The basic unit of the business layer is the essence.
An essence is an element of software that describes a particular type of data and consists of properties and methods. The essences of the business layer are built using the methodologies of object-oriented analysis and design and the paradigm of object-oriented programming. This layer is divided into three levels: the level of controllers, the level of services and the intermediate level.
Requests that come from the client are received and analyzed at the level of controllers, which is represented by a set of RESTful controllers, a routing system and a security provider.
Query analysis at this level includes the following steps:
• Receiving the request URL by the routing system;
• Forwarding the request to the appropriate controller;
• Analyzing the query headers;
• Identifying the client via the security provider;
• Analyzing the request's input parameters;
• Forwarding the request further down the call stack to functionality and services;
• Calling the appropriate services to form a response to the client.
The level of services is represented by a set of services. A service is an entity that receives a request from a controller, analyzes it, and generates an appeal to an external microservice and to intermediate-level entities. The level of services is built according to the standardized Facade design template.
The intermediate level is represented by a set of ORM (object-relational mapping) systems. The main task of the entities of this level is to bridge the differences between object-oriented programming and the functionality designed to store data. The Entity Framework system is used to work with the database. Entities that represent work with database tables are built on the principles of the Unit of Work and Repository design templates.
The background process layer is an independent layer whose processes work autonomously and independently of the user's actions. This layer is represented by a finite set of processes, each of which solves a separate, definite task. The frequency of running each component depends on the event scheduler settings. The main task of the background process layer is to periodically update the data in the database. The main advantage of component independence is the ability to expand the functionality of the software algorithm by adding an additional component.
The support component is additional functionality that extends the system to include a data provider and an auxiliary microservice. The data layer is represented by a set of relational databases and resource storage systems. System resources are stored in separate repositories, which are categorized as multimedia data repositories, configuration file repositories and document storage.
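The system itself uses Entity Framework (.NET); as a language-agnostic illustration of the Unit of Work and Repository templates it relies on, consider this in-memory Python sketch, whose class and method names are assumptions:

```python
class Repository:
    """In-memory repository for one entity type (stand-in for an ORM table)."""
    def __init__(self):
        self._items = {}

    def add(self, key, item):
        self._items[key] = item

    def get(self, key):
        return self._items.get(key)

class UnitOfWork:
    """Groups repository changes so they are committed (or dropped) together."""
    def __init__(self):
        self.users = Repository()
        self._pending = []

    def register(self, repo, key, item):
        self._pending.append((repo, key, item))

    def commit(self):
        for repo, key, item in self._pending:
            repo.add(key, item)
        self._pending.clear()

uow = UnitOfWork()
uow.register(uow.users, 1, {"name": "alice"})
assert uow.users.get(1) is None  # nothing visible before commit
uow.commit()
assert uow.users.get(1) == {"name": "alice"}
```

Deferring writes until `commit` is what lets the data layer treat a batch of changes as a single consistent operation.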

Results and Discussion
The main result of this investigation is the modeling of the process of searching and analyzing heterogeneous user data to improve the functioning of diverse web projects. This model was tested by analyzing diverse web projects implemented on popular platforms, namely: Facebook, YouTube, Instagram, LinkedIn, Pinterest and TikTok.
The following approaches were used for testing:
• Search for user data in diverse web project platforms, with further consolidation into a single profile for each user;
• Quantitative analysis of the users of each web project platform.
The period of active research on the diverse web project management system was September 2020 to November 2020. During this period, a total of 959,485 user profiles and 7684 web projects were researched and analyzed (see Figure 7). The age of users of diverse web projects was determined by examining the relevant field in the personal information area of the user's profile. If such a field was not provided in a given diverse web project, the age of users was automatically determined by the system based on analysis of the personal photos uploaded to the user profile.
Determining the age of the participant by searching and analyzing the data of web project users helps to reduce the level of conflict in projects where there are age restrictions for users (projects only for adults/children/adolescents).

Conclusions
The unprecedentedly large amount of information available in diverse web projects, the large number of their users, the complexity of building relationships between users and the extremely wide scope of diverse web projects have made web project management a complex activity with many risks and unforeseen critical situations. In this article we analyzed the concept of heterogeneity in these web projects, and the process of their structuring. Algorithms for user data analysis for managing various web projects were developed. The developed model of the user data analysis complex for the management of various web projects represents a multilevel web service. The use of this model for diverse web project activities will simplify the management process, reduce the cost of maintaining a community, increase cost-effectiveness, and reduce the number of crisis situations.