2.2.1. GERT Model of the First Stage of User Request Processing
The first stage of user request processing is focused on structuring the incoming information, verifying the correctness of categorization, and resolving potential ambiguities. The logic of this process is formalized using a probabilistic graphical model, GERT (Graphical Evaluation and Review Technique), which is illustrated in
Figure 4.
In
Figure 4, the process stages are schematically represented as a directed graph, where each node corresponds to a processing step, and the arcs represent probabilistic transitions between them. The model includes six nodes and six transitions. The list of nodes and their formal descriptions is shown in
Table 2.
After defining the main nodes of the GERT model for the first stage, it is necessary to examine the structure of transitions between them, reflecting the dynamics of request processing. The transitions between the model’s nodes are probabilistic in nature and are governed by the logic of conditions for passing through or returning.
The initial node (node 1) represents the moment the user’s request is received. From this node, the process moves to node 2, which reflects the initial categorization of the request. This transition occurs via arc W12, which is characterized by a nearly certain probability of passage, as categorization is a mandatory step in processing any request.
After the categorization is completed, the process transitions via arc W23 to node 3, where the assigned category is validated. At this stage, the category is checked for consistency with the user’s request, the data structure, and the expected types of information. If validation is successful, the transition proceeds to node 5 via arc W35, corresponding to the next stage—assigning weights for further prioritization of information.
If the validation fails (e.g., if the category is assigned incorrectly or the request turns out to be ambiguous), an alternative path is activated as follows: via arc W34, the system transitions to node 4, where the request is reformulated or corrected. The purpose of this step is to eliminate ambiguities and bring the request structure to a format suitable for repeated categorization. Then, via arc W42, the process returns to node 2, and the categorization is performed again.
In the case of successful categorization and validation, the system reaches node 5, corresponding to the stage of weight assignment, and then proceeds via arc W56 to the final node 6, which completes the first stage of processing and hands control over to the next module of the system.
Thus, the structure of the GERT graph enables multiple returns and iterations, particularly between nodes 2–3–4, providing flexibility in processing unstructured or erroneous requests and increasing the system’s resilience to user errors.
To analyze the temporal characteristics of the process, classical methods of GERT network analysis can be used. For example, the expected total processing time of a request, E[T], is defined as the sum of the weighted average times over all possible paths leading from the initial to the final node, taking into account the probabilities of traversing each path.
Formally, for a system with possible cycles, this can be expressed through a system of equations incorporating Markov logic or by using the shortest path method with probabilities adapted for GERT. This approach not only allows for the estimation of the average time to complete the stage but also enables the identification of the most resource-intensive and sensitive segments (e.g., excessive accumulation of probability on the return arc W42).
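The expected-time reasoning above can be sketched numerically. In the snippet below, all mean arc times and the validation success probability are hypothetical placeholders (the paper's calibrated values appear later in Table 3); the point is only the structure of the cycle equation for the loop 2–3–4–2.

```python
# A minimal sketch of the expected-time computation for the first stage,
# assuming illustrative (hypothetical) mean arc times and a validation
# success probability p35; these are NOT the paper's calibrated values.
t12, t23, t34, t42, t35, t56 = 0.5, 1.0, 0.8, 0.4, 0.6, 0.3
p35 = 0.8          # probability that validation at node 3 succeeds
p34 = 1.0 - p35    # probability of the correction loop 3 -> 4 -> 2

# Let E2 be the expected remaining time on entering node 2.  The cycle
# 2-3-4-2 gives E2 = t23 + p35*(t35 + t56) + p34*(t34 + t42 + E2),
# which solves in closed form to:
E2 = (t23 + p35 * (t35 + t56) + p34 * (t34 + t42)) / p35

# Cross-check by fixed-point iteration of the same recurrence:
E2_iter = 0.0
for _ in range(200):
    E2_iter = t23 + p35 * (t35 + t56) + p34 * (t34 + t42 + E2_iter)

E_total = t12 + E2   # total expected time from node 1 to node 6
assert abs(E2 - E2_iter) < 1e-9
```

The closed form and the iteration agree because the return arc contributes a geometric series with ratio p34 < 1.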
However, in modern GERT models (unlike the more approximate “step-by-step” evaluations in the previous example), the calculation is performed through an equivalent transfer function. This function accounts for the following:
- transition probability $p_{ij}$;
- intensity or delay (the time-distribution law), represented via the Laplace transform of the time $T_{ij}$.
The result is the equivalent transfer function of a GERT arc:

$$W_{ij}(s) = p_{ij}\, M_{ij}(s),$$

where $M_{ij}(s)$ is the Laplace transform of the execution time along arc $(i, j)$. In this expression, $p_{ij}$ represents the probability of transitioning from node $i$ to node $j$, while $M_{ij}(s)$ is the Laplace transform of the probability distribution function of the processing time associated with that transition. The function $M_{ij}(s)$ encodes the stochastic duration of the corresponding process step in the frequency domain.
One of the tasks at this stage of modeling is the selection of the distribution function for the random variable representing transition execution time. In a number of studies [31], the problem was formulated using the exponential distribution for transition times between nodes of the GERT graph, owing to its mathematical simplicity. The Laplace transform of the exponential density is

$$M(s) = \frac{\lambda}{\lambda + s},$$

which significantly simplifies analytical calculations.
Nevertheless, despite the popularity of the exponential distribution in queueing theory, its application in modeling systems for processing user requests is not always justified. The exponential distribution has the memoryless property, meaning the probability of completing an operation in the next moment does not depend on how much time has already passed. This is acceptable for modeling asynchronous events such as request arrivals or trigger activations. However, within the system under consideration, stages such as categorization, validation, and weight assignment typically require a fixed or nearly fixed processing time. To model the execution time of individual stages, the Erlang distribution was chosen—a special case of the gamma distribution, applied when the process consists of k sequential phases, each following an exponential distribution. The advantages of mathematical formalization using the Erlang distribution include the following factors.
- Physical realism: it reflects the step-by-step structure of tasks (e.g., request parsing → entity extraction → category matching);
- Controllable variance: for a fixed rate $\lambda$, increasing the parameter $k$ makes the distribution closer to deterministic;
- Positive start time: there is no probability of zero-time completion.
The Laplace transform for the Erlang distribution of order $k$ is given by the following:

$$M(s) = \left(\frac{\lambda}{\lambda + s}\right)^{k}.$$
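As a quick consistency check, the Erlang transform $(\lambda/(\lambda+s))^k$ can be differentiated at $s = 0$ to recover the standard Erlang moments. The sketch below does this symbolically with SymPy; the order $k = 3$ is an arbitrary illustration, not a model parameter.

```python
# Check, with SymPy, that the Erlang Laplace transform (lam/(lam+s))**k
# yields mean k/lam and variance k/lam**2 (the standard Erlang moments).
import sympy as sp

s, lam = sp.symbols('s lam', positive=True)
k = 3                                # illustrative order
M = (lam / (lam + s)) ** k           # Laplace transform of the Erlang pdf

mean = -sp.diff(M, s).subs(s, 0)     # E[T]   = -M'(0)
m2 = sp.diff(M, s, 2).subs(s, 0)     # E[T^2] =  M''(0)
var = sp.simplify(m2 - mean ** 2)

assert sp.simplify(mean - k / lam) == 0
assert sp.simplify(var - k / lam ** 2) == 0
```

This also confirms the "controllable variance" property above: the coefficient of variation is $1/\sqrt{k}$, shrinking as $k$ grows.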
Taking the presented facts into account, the equivalent time function for the first stage of user request processing (forward path 1→2→3→5→6 with the correction loop 2→3→4→2) is calculated as follows:

$$W_E(s) = \frac{W_{12}(s)\, W_{23}(s)\, W_{35}(s)\, W_{56}(s)}{1 - W_{23}(s)\, W_{34}(s)\, W_{42}(s)}.$$
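The loop structure described above (forward path 1–2–3–5–6 plus the correction cycle 2–3–4–2) can be verified symbolically: at $s = 0$ the equivalent function of a well-formed GERT network must equal 1. The arc transforms and the probability value below are illustrative assumptions, not the paper's parameters.

```python
# Symbolic check of the first-stage equivalent function built for the
# single loop 2-3-4-2.  Exponential arc transforms and p35 = 4/5 are
# illustrative assumptions only.
import sympy as sp

s, lam = sp.symbols('s lam', positive=True)
p35 = sp.Rational(4, 5)          # assumed validation success probability
M = lam / (lam + s)              # one exponential arc transform (example)

W12, W23, W56, W42 = M, M, M, M  # arcs traversed with probability 1
W35 = p35 * M                    # successful validation
W34 = (1 - p35) * M              # correction branch

W_E = (W12 * W23 * W35 * W56) / (1 - W23 * W34 * W42)

# At s = 0 each M equals 1, so a well-formed network gives W_E(0) = 1:
assert sp.simplify(W_E.subs(s, 0)) == 1
```

The check at $s = 0$ works because the transition probabilities out of node 3 sum to one, so the network conserves probability.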
The use of the proposed equivalent W-function will subsequently enable the derivation of an analytical expression for calculating the time required to implement the first stage of user request processing. In addition, this approach allows the complete distribution function of the request traversal time through the model to be determined.
The conducted analysis of the process formalized in
Figure 4 made it possible to establish the characteristics of the branches considered in the GERT model, as well as the distribution parameters. These are presented in more detail in
Table 3.
By substituting the expressions for each arc into expression (3), we obtain the following:
After performing straightforward mathematical transformations, we arrive at the expression for calculating the equivalent
W-function.
The resulting transfer function of the entire network, $W_E(s)$, represents the characteristic function of the request processing time: it is the Laplace transform of the desired probability density function of the processing time $f(t)$. Formally, the relationship between the transfer function and the time distribution is expressed as follows:

$$\varphi(s) = \frac{W_E(s)}{W_E(0)} = \int_{0}^{\infty} e^{-st} f(t)\, dt,$$

where $\varphi(s)$ is the characteristic function of the time.
The probability density function $f(t)$ can be obtained from $\varphi(s)$ through the inverse Laplace transform. Theoretically, this transformation is written in the form of the Bromwich integral, as follows:

$$f(t) = \frac{1}{2\pi i} \int_{c - i\infty}^{c + i\infty} \varphi(s)\, e^{st}\, ds,$$

where the integration is carried out along a vertical line $\mathrm{Re}(s) = c$ in the complex plane, located to the right of all singularities of the function $\varphi(s)$.
In practice, analytical computation of the Bromwich integral for complex transfer functions, such as those in our model, is challenging. Therefore, for the numerical recovery of the time distribution function, it is reasonable to use specialized numerical methods. In this study, the Stehfest method [32] was used to solve the inversion problem, as recommended in works on the theory of GERT networks [1,33].
The Stehfest method allows the function $f(t)$ to be approximated through a finite sum of transfer-function values at various arguments. The following approximate formula is used:

$$f(t) \approx \frac{\ln 2}{t} \sum_{i=1}^{N} V_i\, \varphi\!\left(\frac{i \ln 2}{t}\right),$$

where $N$ is an even number that determines the approximation accuracy (typically $N = 10$, 12, or 14), and $V_i$ is the special Stehfest weighting coefficient. These coefficients are calculated using binomial coefficients according to the following formula:

$$V_i = (-1)^{N/2 + i} \sum_{k = \lfloor (i+1)/2 \rfloor}^{\min(i,\, N/2)} \frac{k^{N/2}\, (2k)!}{(N/2 - k)!\; k!\; (k-1)!\; (i-k)!\; (2k-i)!},$$

where the factorial products correspond to the binomial coefficients in Stehfest's derivation.
The implementation of the method is as follows: for each fixed time value $t$, the weights $V_i$ and the values of the function $\varphi(i \ln 2 / t)$ at the points $s_i = i \ln 2 / t$ are calculated. These are then used to compute a weighted sum, normalized by the multiplier $\ln 2 / t$. By repeating this procedure for multiple values of $t$ within a specified range, a discrete approximation of the function $f(t)$ can be obtained.
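The procedure just described can be implemented in a few lines. The sketch below codes the Stehfest weights and the weighted sum, and is tested on a transform with a known inverse ($F(s) = 1/(s+1)$, whose inverse is $e^{-t}$).

```python
# A minimal Stehfest inversion sketch, tested on F(s) = 1/(s+1).
import math

def stehfest_coefficients(N):
    """Stehfest weights V_i for even N."""
    V = []
    for i in range(1, N + 1):
        total = 0.0
        for k in range((i + 1) // 2, min(i, N // 2) + 1):
            total += (k ** (N // 2) * math.factorial(2 * k)) / (
                math.factorial(N // 2 - k) * math.factorial(k)
                * math.factorial(k - 1) * math.factorial(i - k)
                * math.factorial(2 * k - i))
        V.append((-1) ** (N // 2 + i) * total)
    return V

def stehfest_invert(F, t, N=12):
    """Approximate f(t) from the Laplace transform F(s)."""
    ln2 = math.log(2.0)
    V = stehfest_coefficients(N)
    return (ln2 / t) * sum(V[i - 1] * F(i * ln2 / t)
                           for i in range(1, N + 1))

# Known inverse: L^{-1}[1/(s+1)](t) = exp(-t)
f1 = stehfest_invert(lambda s: 1.0 / (s + 1.0), 1.0)
assert abs(f1 - math.exp(-1.0)) < 1e-4
```

The method is purely real-valued (no complex arithmetic), which is why it is popular for GERT-type transforms, though it can lose accuracy for strongly oscillatory densities.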
The resulting probability density function allows not only for determining the expected request processing time but also for analyzing the probability of meeting specified deadlines, constructing confidence intervals for the processing time, and assessing the risk of critical delays.
The constructed transfer function (5) of the GERT model for the first stage of user request processing takes the form of a fractional-rational expression, as follows:

$$\varphi(s) = \frac{P(s)}{Q(s)},$$

where $P(s)$ and $Q(s)$ are polynomials in the variable $s$. To obtain the analytical expression of the probability density function $f(t)$, it is necessary to perform the inverse Laplace transform of $\varphi(s)$. However, for fractional-rational functions, direct inversion is difficult. Therefore, a standard solution procedure is applied, as recommended in [25].
The process of recovering the function $f(t)$ consists of several key steps.
In the first step, a change of variable is made,
as a result of which the transfer function takes the following form:
This substitution is used to facilitate the subsequent analysis of the function’s singularities in the complex plane. After the variable change, the roots of the equation are located in the right half-plane, which simplifies their interpretation and makes the subsequent decomposition of the function into partial fractions easier.
Using expression (5), we obtain the following equality:
The next step is solving the equation $Q(s) = 0$.
The roots of this equation may be real numbers (real roots) or form pairs of complex conjugates. Each root can be either simple (multiplicity 1) or multiple (multiplicity > 1).
Finding the roots of the denominator is a key step, as each root determines the behavior of the corresponding component in the result after the inverse transformation.
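Numerically, the denominator roots can be obtained directly from the polynomial coefficients. The toy sketch below uses an assumed quadratic $Q(s) = s^2 + 3s + 2 = (s+1)(s+2)$, purely for illustration, not the model's actual characteristic polynomial.

```python
# Toy illustration of denominator root-finding.  The quadratic below is
# an assumed example, not the model's characteristic polynomial.
import numpy as np

coeffs = [1.0, 3.0, 2.0]            # coefficients of Q(s), highest first
roots = np.sort(np.roots(coeffs))   # companion-matrix eigenvalues
assert np.allclose(roots, [-2.0, -1.0])
```

For the model's higher-degree polynomials the same call applies; complex conjugate roots simply appear as conjugate pairs in the returned array.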
Once all roots $s_k$ of the denominator are found, the fraction $P(s)/Q(s)$ is decomposed into a sum of simple elements. According to standard partial fraction theory, any rational expression can be represented as a sum of the following form:

$$\frac{P(s)}{Q(s)} = \sum_{k} \sum_{j=1}^{m_k} \frac{A_{k,j}}{(s - s_k)^{j}},$$

where $m_k$ is the multiplicity of the root $s_k$, and $A_{k,j}$ are the decomposition coefficients.
The coefficients are determined using standard methods. This may involve comparing polynomial coefficients after bringing them to a common denominator, or using the method of equivalent residues at the poles (residue method).
If the root is simple ($m_k = 1$), the corresponding term of the fraction has the form $A_k/(s - s_k)$. If the root is multiple, all degrees from $1$ to $m_k$ are taken into account.
In the next stage of the process of recovering the function $f(t)$, known results of the inverse Laplace transform [24] can be used for each term of the decomposition. For a fraction of the form $A/(s - s_k)$, the inverse transform has the following form:

$$\mathcal{L}^{-1}\!\left[\frac{A}{s - s_k}\right] = A\, e^{s_k t}.$$

For a fraction of the form $A/(s - s_k)^{j}$ (multiple root), it is as follows:

$$\mathcal{L}^{-1}\!\left[\frac{A}{(s - s_k)^{j}}\right] = \frac{A\, t^{\,j-1}}{(j-1)!}\, e^{s_k t}.$$
Thus, each pole (root) contributes to the final distribution function in the form of an exponential, possibly multiplied by a power function of time.
At the next stage, by summing the results of the inverse transform for all terms of the decomposition, we obtain the complete expression for the probability density function of the request processing time, as follows:

$$f(t) = \sum_{k} \sum_{j=1}^{m_k} \frac{A_{k,j}}{(j-1)!}\, t^{\,j-1}\, e^{s_k t}.$$

Each term represents an exponential function of time $e^{s_k t}$ multiplied by a power function $t^{\,j-1}$ and weighted by the coefficient $A_{k,j}$. In the case where the roots of the denominator are complex conjugate numbers, their contribution to the function $f(t)$ is real. This is ensured by the fact that the sums of exponentials of conjugate roots reduce to cosine and sine functions via Euler's formulas.
For example, if $s_k = \alpha + i\beta$ and $\bar{s}_k = \alpha - i\beta$, then the following holds:

$$A\, e^{s_k t} + \bar{A}\, e^{\bar{s}_k t} = 2\, e^{\alpha t} \left(\mathrm{Re}(A)\cos\beta t - \mathrm{Im}(A)\sin\beta t\right),$$

and the contribution of such a pair to the function $f(t)$ is expressed through damped oscillations.
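The decomposition and term-by-term inversion described above can be reproduced with SymPy on a small illustrative transform. The rational function below is an assumed example, not the model's $\varphi(s)$.

```python
# Partial-fraction decomposition and inversion of an illustrative
# rational transform 1/((s+1)(s+2)); its exact inverse is
# exp(-t) - exp(-2t).
import sympy as sp

s, t = sp.symbols('s t', positive=True)
F = 1 / ((s + 1) * (s + 2))

# Decomposition into simple elements: 1/(s+1) - 1/(s+2)
parts = sp.apart(F, s)
val_parts = float(parts.subs(s, 3))       # must equal F(3) = 1/20
assert abs(val_parts - 1.0 / 20.0) < 1e-12

# Each simple pole s_k contributes A_k * exp(s_k * t):
f = sp.inverse_laplace_transform(F, s, t)
val = float(f.subs(t, 1))
assert abs(val - (float(sp.exp(-1)) - float(sp.exp(-2)))) < 1e-9
```

Both poles here are real and simple; a complex conjugate pair would instead produce the damped-oscillation form given by Euler's formulas above.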
Let us conduct experimental studies of the presented mathematical model of the first stage of user request processing.
We will determine the probability density function of the processing time for the first stage of user request handling. In doing so, we take as initial data the branch parameters of the GERT network: the transition probabilities and the intensities of the time distributions for each arc. Substituting the specified values into Expression (4), we obtain the explicit form of the characteristic equation $Q(s) = 0$.
This equation has four real roots and one pair of complex conjugate roots. After finding the roots of the equation $Q(s) = 0$, the next step is the numerical determination of the probabilistic characteristics of the request traversal time through the model of the first stage. For this, the inverse Laplace transform method is used, which allows the probability density function to be obtained from the transfer function of the GERT network.
Based on the obtained density function, the key statistical parameters of the first stage were calculated.
- the mathematical expectation of the processing time;
- the standard deviation of the processing time.
These values provide a quantitative estimate of the average duration of user request processing and the degree of time value dispersion. Further research will make it possible to compare these results with the characteristics of the second and third stages, as well as to build an aggregated model of the total execution time.
2.2.2. GERT Model of the Stage for Assigning Quantitative Weights to Identified Categories
After completing the stage of initial categorization of the user request, the system proceeds to assign quantitative weights to each of the identified categories.
This process is essential for determining the prioritization of various aspects of the request during the subsequent response generation.
The second stage includes a sequence of operations organized according to the logic presented in
Figure 5 as a GERT scheme.
The list of nodes and their formal descriptions is presented in
Table 4.
In the scheme presented in
Figure 5, the arcs between nodes formalize the following state-to-state transitions.
The arc characterizes the start of the weight assignment process, the transition from the initial state to category analysis. The arc illustrates the transition to verifying the correctness of the assigned weights. The arc characterizes the transition to the weight correction procedure in case of an error. Accordingly, arc illustrates the return to weight reassignment in case of an error.
The transition and arc characterize successful validation, after which two outcomes are possible. The first is the completion of the stage, the second is the clarification-based reassignment of weights. The second option is characterized by arc . Introducing the additional loop increases the model’s accuracy, bringing it closer to real-world scenarios, where even after the formal completion of a stage, returning to previous steps may occur due to changes in the request context, new requirements from the system or user, and inconsistencies between the assigned weights and current processing policies. This is especially relevant in adaptive or learning systems, where weight coefficients may be sensitive to the dynamics of input data.
Taking into account the presented characteristics of the transitions in the GERT network for the stage of establishing quantitative importance weights for each identified category, the equivalent transfer function of the scheme (Figure 5) takes the following form:
The conducted analysis of the process formalized in
Figure 5 made it possible to establish the characteristics of the branches considered in the GERT model and the distribution parameters. These are presented in more detail in
Table 5.
For each branch of the GERT network of the second stage, types of time distributions were selected that reflect the specifics of the corresponding operations. The basis for this selection is the structure of the algorithm itself, the type of transition, and the nature of processing inherent to each stage.
In particular, the most complex and logically intensive processes, weight assignment and its validation, are modeled using an Erlang distribution of order $k = 2$, reflecting the two-phase nature of these steps. This makes it possible to account for delays caused by the need to go through a series of logical checks or calculations.
Other arcs correspond to actions typically performed in one step or within the framework of simple administrative logic (for example, revalidation, error return, and processing completion). For them, the exponential distribution was selected, reflecting the memoryless (Markovian) nature of processing [33].
Thus, the model accounts for the varying temporal complexity of different stages, ensuring more realistic behavior of the processing time function at this stage.
The process of restoring the probability density function of the processing time from the transfer function of the GERT network begins with a transition to the complex frequency domain by means of a variable substitution.
The transfer function then takes the following form:
where each component function
is calculated taking into account the variable substitution and the corresponding form of the Laplace transform.
In this GERT model, the denominator has the following structure:
where
is the resulting expression, as follows:
Thus, the poles
are defined as the solutions to the following equation:
The roots found (real or complex conjugate) determine the behavior of the function $f(t)$: its shape, symmetry, and the “tail” of the distribution, as well as the position of the expected value and the nature of the exceedance probability.
The procedure for computing the roots is similar to the first example (model 12–16) and consists of the following steps.
The transfer function is formed using all model parameters, with their specific values substituted. For each time value $t$, an approximate value of the density $f(t)$ is calculated.
Based on the array of computed values, the following are constructed:
- the density function $f(t)$;
- the distribution function $F(t)$;
- the exceedance probability $P(T > \tau)$;
- the mathematical expectation and the variance of the processing time.
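These steps can be sketched with plain trapezoid integration over the sampled density. Below, the known density $e^{-t}$ stands in for the Stehfest samples so that every recovered statistic can be checked against its exact value.

```python
# Recovering summary statistics from density samples f(t_j) by trapezoid
# integration.  The exact density exp(-t) stands in for inverted
# Stehfest output; for it, norm = mean = var = 1 and P(T>5) = exp(-5).
import numpy as np

t = np.linspace(1e-6, 40.0, 40001)
f = np.exp(-t)                          # stand-in for inverted samples

def trapz(y, x):
    """Plain trapezoid rule over a sorted grid."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x) / 2.0))

norm = trapz(f, t)                      # ~1: f is a proper density
mean = trapz(t * f, t)                  # E[T]
var = trapz((t - mean) ** 2 * f, t)     # Var[T]
tail = trapz(f[t > 5.0], t[t > 5.0])    # exceedance P(T > 5)

assert abs(norm - 1.0) < 1e-3
assert abs(mean - 1.0) < 1e-2
assert abs(var - 1.0) < 1e-2
assert abs(tail - np.exp(-5.0)) < 1e-3
```

The distribution function $F(t)$ follows from the same samples by cumulative summation of the trapezoid areas.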
After substituting the specified model parameters and performing the change
, the characteristic equation of the GERT network for the second stage takes the following form:
The calculations used the specified transition probabilities and intensities for each branch of the second-stage network. For numerical analysis of the dynamics of the time characteristics, it is necessary to determine the roots of the equation $Q(s) = 0$. The resulting polynomial has four real roots and two pairs of complex conjugate roots.
The obtained root values define the structure of the exponential components of the probability density function for the stage traversal time. They are further used in the numerical Laplace inversion process and in calculating model characteristics.
Based on the resulting density function, the key statistical parameters of the second stage of establishing quantitative importance weights for each identified category were calculated.
- the mathematical expectation of the stage processing time;
- the standard deviation of the stage processing time.
2.2.3. GERT Model of the Final Result Generation Stage
After the assignment and validation of user data weights, the system proceeds to the third stage of request processing—the generation of the final result. The purpose of this stage is to combine the previously obtained formalized data and weighting coefficients into a unified, consistent result, suitable for presentation to the user or for transfer to subsequent processing stages.
The logic of the stage is shown in
Figure 6 as a GERT scheme, which reflects the sequence of operations and possible return cycles characteristic of this process.
The GERT network consists of five nodes and six arcs, each corresponding to a separate processing stage. The arc characterizes the transition from data acquisition to system analysis of weights. Transition characterizes the computation of the preliminary result. The arc illustrates the transition to finalizing the result upon successful verification. The arc indicates the possibility of sending data for correctness checking. Branch formally represents a return to weight reassessment in case a logical error is detected. The arc characterizes a return to result recalculation if a technical or formal error in computations is identified.
This approach allows for consideration of various failure scenarios, for example, logical failures due to incorrect interpretation of weights, or computational failures due to issues at the result generation level.
The list of nodes and their formal descriptions is presented in
Table 6.
The transfer function of the model takes the following form:
The conducted analysis of the process formalized in
Figure 6 made it possible to establish the characteristics of the branches considered in the GERT model and the distribution parameters. These are presented in more detail in
Table 7.
By performing mathematical transformations similar to those in stages 1 and 2, we obtain the following data. As input values, we take five transition probability parameters, and we select seven intensities corresponding to the final result generation process.
Substituting the variable
and constructing the characteristic equation of the form
leads to obtaining a sixth-degree characteristic equation, as follows:
The solution to this equation includes four real roots and one pair of complex conjugate roots.
Reconstructing the probability density function using the inverse Laplace transform method (e.g., Stehfest’s algorithm), we obtain the following:
- the mathematical expectation of the stage processing time;
- the standard deviation of the stage processing time.
These results indicate high stability of this stage, with moderate internal cycle complexity and localized potential for result refinement.
The step-by-step analysis carried out above made it possible to describe in detail the logical structure, probabilistic characteristics, and temporal dependencies of each of the three stages of user request processing.
To construct the complete system model, it is necessary to combine these three components into a unified GERT model. It is assumed that each stage is performed sequentially, without overlap, and the output of one stage serves as the input for the next. Thus, the entire model is interpreted as a sequential connection of three stochastic subnetworks, each of which has its own transfer function.
The overall transfer function of the combined model is defined as the product of the transfer functions of the individual stages, as follows:

$$W(s) = W_{E1}(s)\, W_{E2}(s)\, W_{E3}(s),$$

where $W_{E1}(s)$ is the transfer function of the GERT model for the first stage of user request processing, $W_{E2}(s)$ is the transfer function of the GERT model for the second stage of establishing quantitative importance weights for each identified category, and $W_{E3}(s)$ is the transfer function of the GERT model for the third stage of final result generation.
Forming the complete model allows transitioning from the analysis of local algorithm fragments to the evaluation of the integrated response time of the entire system to a user request, including all possible iterations, returns, and variations within each stage.
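A direct consequence of the product form for sequential stages is that the stage means add. The short SymPy check below uses three exponential stages with arbitrary illustrative rates, not the calibrated stage parameters.

```python
# For stages connected in series the transfer functions multiply, and
# hence the stage means add.  Rates are illustrative assumptions.
import sympy as sp

s = sp.symbols('s', positive=True)
rates = [sp.Rational(1, 2), sp.Rational(2, 3), sp.Rational(5, 4)]
W = sp.prod([lam / (lam + s) for lam in rates])   # W_E1 * W_E2 * W_E3

total_mean = -sp.diff(W, s).subs(s, 0)            # E[T] = -W'(0)
assert sp.simplify(total_mean - sum(1 / lam for lam in rates)) == 0
```

The variances of independent sequential stages add in the same way, which is what makes the aggregated model tractable.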
To parameterize the GERT models for each stage of data processing, we introduced transition probabilities and processing intensities that reflect the typical uncertainty and feedback mechanisms observed in personalized recommendation systems. The values of key parameters were selected based on a combination of domain knowledge, structural features of recommendation workflows, and observations collected during preliminary experimental deployments in Polish cultural heritage institutions. These observations included time delays in user interaction, frequency of semantic query reformulations, and branching behavior in user navigation patterns.
For example, higher transition probabilities (e.g., 0.9–0.95) were assigned to primary processing paths, while lower probabilities (0.3–0.6) reflected occasional backward transitions or alternate paths associated with query refinement or user context updates. Processing intensities (λ) ranged from 0.1 to 1.4, modeling various delay profiles such as metadata enrichment or real-time user personalization.
To evaluate the robustness of the models under parameter variability, we conducted a sensitivity analysis by varying probabilities and intensities by ±10%, ±20%, and ±30%. Statistical outputs including mean processing time, standard deviation, and tail quantiles were recalculated at each step. While absolute values shifted moderately, the overall behavioral profile of the system remained stable, confirming the reliability and interpretability of the chosen parameterization approach. As illustrated in
Figure 7, the core distribution profile remained consistent across variations, confirming the model’s stability and interpretability.
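A minimal version of such a sweep can be sketched as follows; the rates are illustrative assumptions, and the exponential-stage mean serves as the recalculated statistic.

```python
# Sensitivity sweep sketch: scale all intensities by +/-10/20/30% and
# recompute the mean processing time (exponential stages, so the mean
# is sum of 1/lambda_i).  The base rates are illustrative only.
base_rates = [0.4, 0.7, 1.2]
base_mean = sum(1.0 / r for r in base_rates)

results = {}
for delta in (-0.3, -0.2, -0.1, 0.0, 0.1, 0.2, 0.3):
    rates = [r * (1.0 + delta) for r in base_rates]
    results[delta] = sum(1.0 / r for r in rates)

# Scaling every rate by (1+delta) scales the mean by 1/(1+delta), so
# absolute values shift moderately while the profile keeps its shape:
for delta, mean in results.items():
    assert abs(mean - base_mean / (1.0 + delta)) < 1e-9
```

In the full model the probabilities are varied as well, so the shifts are not a pure rescaling, but the loop structure of the sweep is the same.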
To perform the numerical inversion of Laplace transforms and simulate the probability density functions of execution time, custom Python scripts were developed using libraries such as SymPy 1.12, NumPy 1.25.0, and mpmath 1.3.0. The inversion was implemented using Talbot’s method for stability across a wide domain, while comparative tests were also run using Stehfest’s approximation.
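For reference, mpmath exposes both inversion methods through `mpmath.invertlaplace`; the sketch below cross-checks the Talbot and Stehfest variants on a transform with a known inverse.

```python
# Cross-check of Talbot and Stehfest inversion via mpmath.invertlaplace
# on F(s) = 1/(s+1), whose exact inverse is f(t) = exp(-t).
import mpmath as mp

mp.mp.dps = 20                       # working precision
F = lambda s: 1 / (s + 1)

talbot = mp.invertlaplace(F, 1.0, method='talbot')
stehfest = mp.invertlaplace(F, 1.0, method='stehfest')

assert abs(talbot - mp.exp(-1)) < 1e-8
assert abs(stehfest - mp.exp(-1)) < 1e-3
```

Talbot's contour method is generally the more stable of the two over a wide time range, which matches its choice as the primary method in this study.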
All experiments were conducted on a local workstation running Windows 11 with an Intel Core i7 processor (2.6 GHz), 32 GB of RAM, and a Python 3.11 environment. No GPU acceleration was used. The average execution time for processing each model ranged from 12 to 24 s, depending on time discretization.