Abstract
A methodology is developed for the statistical modeling of boundary value problems of mathematical physics for parabolic equations that describe transport processes in a layer when the data at the boundary of the body are incomplete. The boundary value problem is formulated for a non-zero initial condition, a stable source at one boundary of the body (classical boundary condition), and a sample of experimental data for the desired function at the other boundary (statistical boundary condition). A linear regression model obtained from the experimental data by the least squares method is used as the boundary condition. The article defines two-sided statistical estimates of the solution of the boundary value problem through the linear regression coefficients, analyzes the mathematical model taking into account the influence of the sample size and covariance, and determines the reliable intervals for the linear regression and for the desired function depending on the given level of reliability. The influence of the statistical characteristics of the experimental data on the desired function at the lower boundary of the layer is studied for different types of samples over long and short time intervals. The two-sided critical domain is obtained and analyzed on the basis of Fisher’s criterion. The influence of the reliability level on the reliable intervals, on the solution of the parabolic boundary value problem, and on the width of the bilateral critical domain constructed for the solution is analyzed.
1. Introduction
Various tasks in engineering, ecology, biology, pharmacology, and other technical and technological research require the further development of approaches and methods for the mathematical description of non-equilibrium processes of a different physical nature [1]. The relevance of the above problems is caused by the need to develop effective methods and estimates for predicting the distribution of man-made pollution, assessing the quality of drinking water and improving its treatment on an industrial scale, determining the effect of diffusion of drugs into tissue, diffusion of aggressive substances when assessing the reliability and durability of structural elements and components to prevent the destruction of the relevant materials, etc. Therefore, statistical modeling methods that combine classical approaches to mathematical modeling and methods of mathematical statistics are relevant. These studies enable us to obtain a reliable forecast of the processes taking place in environmental objects, industrial equipment units, the human body, etc., and to take the necessary measures in time to prevent their negative development. The main approaches and methods of statistical modeling have been developing quite rapidly over the past few decades [2,3].
These methods are widely used in environmental problems. The complexity of the study is explained by the multiphase behavior of the diffusion process. Over recent decades, researchers have developed various statistical models of multiphase flow, mass transfer, and solute transport to simulate the distribution of groundwater contamination [4,5], to explore the concepts of electrochemical nucleation, and in [6,7] to analyze meteorological fields obtained to reflect the characteristics of air pollution. Statistical modeling methods are actively used in biology and medicine. In [8], a deterministic and statistical model describing synaptic communication in a biological molecular communication system was proposed and formulated for the diffusion equation in terms of a hypergeometric distribution. Özdural et al. [9] presented a mathematical statistical model based on nonequilibrium conditions that describes the dynamic adsorption of proteins. A data-driven methodology for economical modeling and the analysis of dynamic cerebral autoregulation of arterial blood pressure, cerebral blood flow velocity, and their time derivatives was developed [10]. The dynamics of diffusion processes were analyzed by examining the eigenvalues of the Markov random walk matrix in the data set. Blood propagation, blood flow dynamics, and diffusion were modeled in [11] using the Monte Carlo method and the laminar Navier–Stokes equation. Computational statistical models help to eliminate the risks of excessive blood heating and improve the design of blood flow devices. Models of the diffusion release of drugs from hydrogel ophthalmic lenses were analyzed using probabilistic modeling methods in [12].
The statistical modeling technique has shown particular efficiency in engineering problems. The aim of paper [13] was to model heat and mass transfer in electronics problems, whose mathematical models are based on the perturbed circuit equation. Using the precise numerical statistical modeling of flow and transfer, Gouze et al. [14] studied the increase in the scale of passive dissolved matter transfer in a carbonate rock sample characterized by microporous areas. Statistical modeling was used in [15] to study the transfer of dissolved oxygen with a high Schmidt number from oscillating turbulent flows to a permeable layer of microbial sediment. The mathematical model of the kinetics of neutral particles diluted in a gas or plasma and reacting with a catalytic surface is formulated as a boundary value problem for the reaction–diffusion equation. However, this approximation does not always adequately reflect the transfer process. As an alternative, Longo et al. [16] used Monte Carlo simulation, taking into account the problem of statistical error. The aim of the study [17] is to develop probabilistic analytical models for predicting the characteristics of concrete in terms of chloride penetration. Gusev and Nikolaev [18] proposed a method for assessing the thermal state of insulating cellular panels, which is described by a boundary value problem for a parabolic equation with a discontinuous diffusion coefficient and boundary conditions of the third kind. The problem is solved by means of a probabilistic representation of the functional expectation of the diffusion process corresponding to the boundary value problem. The possibility of the non-invasive determination of the diffusion approximation parameters in the radiation transfer theory, under the condition of a fixed input radiation power, was investigated by numerical and statistical modeling in [19].
De Lemos and Mesquita [20] presented a numerical simulation of the combustion of an air/methane mixture in porous materials using a model that accounts for intra-pore levels of turbulent kinetic energy. Statistical tests and corresponding mathematical modeling were used in [21] to determine the effect of temperature, moisture content, density, and porosity of the material on the effective moisture diffusion coefficient during the convective drying of celery root. A statistical approach for modeling small molecules at non-zero concentrations in microporous materials was developed in [22].
Separately, statistical modeling has been further developed in recent years through approaches based on various nonlinear regression models [23]. The study [24] proposes a new method for calculating nonlinear heat conduction processes under Robin boundary conditions. The article [25] develops and implements an effective numerical scheme for studying solutions to the nonlinear diffusion mass transfer problem.
In view of the need to solve the problems mentioned above and a number of other similar problems, a certain mathematical apparatus for statistical modeling of reaction–diffusion problems has been developed. A systematic presentation of methods and approaches to statistical modeling of random diffusion fluxes of impurity particles in a two-phase stratified strip with a stochastic arrangement of phases and random thickness of inclusion layers is given in [26]. Similar diffusion models are important in the design of composite layered materials [27], the study of filter throughput properties [28], the prediction of the spread of pollutants in the environment, etc. The calculation formulas for the diffusion flux averaged over an ensemble of phase configurations and over the thickness of the inclusion were obtained.
The analysis of the problem has shown that it is not always possible to impose boundary conditions correctly on the basis of physical considerations, even in a fairly general form. This is due to the complexity of the problem and the insufficiency of the relevant research, so the analysis and necessary generalizations are lacking. For this reason, a mathematical model of the transfer process in a layer with experimental data given on a part of the layer boundary was developed and studied in [29]. From the experimental data, a boundary condition is constructed, the corresponding mixed problem is formulated and solved, the influence of the statistical characteristics of the sample of experimental data on the solution is analyzed, and a two-sided statistical estimate of the solution is determined. The reliable intervals for the coefficients of the regression equation and the corresponding reliable intervals for the required function are established for samples of different size and variance.
This paper considers a parabolic boundary value problem that describes the processes of heat, mass, charge, etc., transfer in a layer when experimental data on the required function are available at one of the boundaries under a non-zero initial condition. The main focus of the research in this article is to expand and generalize the range of possible statistical modeling approaches to studying the linear transport models presented in [26,27,28,29]. In particular, developing the ideas and approaches of the paper [29], we present an effective method for establishing reliable intervals for solutions of parabolic differential equations with experimental data at the boundary of the body and analyze the influence of the reliability level on the width of the reliable intervals. Developing and generalizing the methodology of the above-mentioned papers, a two-sided critical domain for the solution is constructed. The aim of the article is to conduct a numerical analysis of the solution of the boundary value problem depending on the statistical characteristics of the sample. The article also investigates the influence of the reliability level on the reliable intervals for the linear regression and for the solution of the parabolic boundary value problem.
2. Formulation of a Boundary Value Problem for a Parabolic Equation in a Layer with Experimental Data at the Boundary
In the layer with the thickness , the process described by the function , takes place, being a solution of the second-order partial differential equation [30]
where is a constant coefficient, is time, and is a spatial coordinate.
We assume that at the initial moment the function
and at the upper surface of the layer is subjected to a stable source
At the lower boundary of the layer, the values of the function were obtained experimentally at time points (Table 1).
Table 1.
Experimental data at the lower boundary of the layer.
Then a linear regression model is built [31,32], searching for its coefficients using the least squares method [33]
The coefficients and are calculated using the following formulas
where . As a result, the boundary condition at will take the form
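The least squares step described above can be sketched as follows. This is a minimal illustration in plain Python, assuming the regression has the usual form "intercept plus slope times time"; the names `fit_line`, `a`, `b` and the sample values are hypothetical and are not taken from Table 1.

```python
def fit_line(times, values):
    """Least-squares straight line values ~ a + b * t; returns (a, b).

    Assumption: a is the intercept and b the slope; the article's own
    symbols for the two coefficients are not reproduced here.
    """
    n = len(times)
    if n < 2 or n != len(values):
        raise ValueError("need at least two paired observations")
    sum_t = sum(times)
    sum_y = sum(values)
    sum_tt = sum(t * t for t in times)
    sum_ty = sum(t * y for t, y in zip(times, values))
    det = n * sum_tt - sum_t ** 2  # zero only if all time points coincide
    if det == 0:
        raise ValueError("time points must not all be equal")
    b = (n * sum_ty - sum_t * sum_y) / det
    a = (sum_y - b * sum_t) / n
    return a, b


# Hypothetical measurements at the lower boundary (not the article's data).
times = [0.0, 0.5, 1.0, 1.5, 2.0]
values = [1.0, 2.0, 3.0, 4.0, 5.0]
a, b = fit_line(times, values)
```

The fitted line then plays the role of the time-dependent boundary value on the lower surface of the layer.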
3. Solving a Parabolic Boundary Value Problem Under the Condition of Linear Regression on the Layer Boundary
The problem (1)–(3), (7) is solved by means of the substitution
which reduces it to a problem with zero boundary conditions with respect to the function
The required function satisfies the initial condition
and zero boundary conditions
Applying the finite integral Fourier sine transform [34] (, ) to the boundary value problem (9)–(11), the following problem is obtained in images
The solution to the problem (12), (13) is the following expression [35]
After the inverse Fourier sine transform [36] is applied to the relationship (14), one can obtain
Taking into account (8), the solution to the problem (1)–(3), (7) is obtained in the form
Formula (15) can be specified taking into account that it is a linear regression, i.e., Formula (4) is valid. Then
and, accordingly, the solution becomes
Since the sample of the experimental data for the required function is given in the interval , the resulting formula (16) is valid for this time period.
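The substitution and sine-series steps of this section can be illustrated numerically. The sketch below is not the article's Formula (16); it assumes a specific problem of the same type: the equation u_t = D u_xx on 0 < x < L, a constant initial value, a constant value at x = 0 (the stable source), and the regression line a + b t at x = L. All names and parameter values are hypothetical.

```python
import math


def layer_solution(x, t, *, thickness, diff, u_init, u_top, a, b, terms=200):
    """Sine-series solution of u_t = diff * u_xx on 0 < x < thickness.

    Assumed conditions (illustrative only):
      u(x, 0) = u_init, u(0, t) = u_top, u(thickness, t) = a + b * t.
    """
    # Part that is linear in x and carries both boundary values.
    w = u_top + (a + b * t - u_top) * x / thickness
    v = 0.0
    for n in range(1, terms + 1):
        lam = n * math.pi / thickness
        sign = (-1.0) ** n
        c_one = 2.0 * (1.0 - sign) / (thickness * lam)  # sine coefficient of 1
        c_lin = -2.0 * sign / (thickness * lam)  # sine coefficient of x/thickness
        v0 = (u_init - u_top) * c_one - (a - u_top) * c_lin  # initial image
        src = -b * c_lin  # image of the source term created by the substitution
        decay = math.exp(-diff * lam * lam * t)
        v_n = v0 * decay + src * (1.0 - decay) / (diff * lam * lam)
        v += v_n * math.sin(lam * x)
    return w + v


# Hypothetical parameters, chosen only to exercise the function.
params = dict(thickness=1.0, diff=1.0, u_init=0.5, u_top=1.0, a=0.4, b=0.2)
u_mid = layer_solution(0.5, 0.7, **params)
```

As in the text, such a formula is meaningful only for times inside the interval covered by the experimental sample, since the regression line is fitted on that interval.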
4. Two-Sided Statistical Estimation for the Solution of a Boundary Value Problem
To determine the two-sided estimate of the solution to the problem (1)–(3), (7), represented using linear regression on the layer boundary , let us denote
In view of Formula (16), one can obtain
It should be noted that for exponent and the following estimation is true
Hence, for the positive function (for example, if ; ), the next estimation can be obtained
For the negative function (in particular, in the case of ; ), the following inequality is true
Summing the series included in the last two inequalities [37], we obtain for
and in case of
Then, from inequalities (19)–(21), taking into account the relation (17), one can obtain a two-sided estimate for the required function :
- -
- for
- -
- for
Substituting the expressions of the linear regression coefficients in the form of (5) and (6) into inequalities (22) and (23), we obtain
- -
- for
- -
- for
Let us study the asymptotic case at . Using the asymptotic relationship [29]
the following asymptotic inequalities are written
- -
- for
- -
- for
Note that the asymptotic inequalities (26) and (27) imply that the change in sample size affects the bounds of the two-sided estimates (24) and (25) for the function . In the case of experimental values of the same sign, as the sample size increases, both boundaries of the functional intervals for (24) and (25) increase for , , i.e., the bilateral estimate of the solution shifts towards higher values . If the linear regression coefficients are expressed in terms of covariance and variance, the two-sided estimate of the function takes the form
- -
- for
- -
- for
is the covariance of values and [38]; and is the variance of the variable . Note that in the case of , an increase in the covariance of values and shifts the bilateral estimate for upwards, and the interval itself widens. At the same time, an increase in the variance of values shifts the value downwards , and the bilateral estimate narrows. For , an increase of shifts the bilateral estimate downwards, but the interval itself widens. An increase of shifts the values upwards, and the bilateral estimate narrows.
Note that the linear regression coefficients can also be presented using the correlation coefficient [29]. The two-sided estimation of the solution to the initial boundary value problem will then take the form:
- -
- for
- -
- for
where , and are standard deviations of values and , respectively. The obtained bilateral estimate implies that an increase in the correlation coefficient leads to an increase in the bilateral estimation interval for and for to shift the interval upwards, and for to shift the interval downwards.
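The covariance form and the correlation form of the regression slope used in this section are algebraically equivalent, which can be checked numerically. The sketch below assumes the standard relations slope = cov(t, y) / var(t) = r * s_y / s_t; the function name and the data are hypothetical.

```python
import math


def slope_forms(times, values):
    """Return (slope via covariance, slope via correlation, intercept)."""
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(values) / n
    cov = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, values)) / n
    var_t = sum((t - mean_t) ** 2 for t in times) / n
    var_y = sum((y - mean_y) ** 2 for y in values) / n
    r = cov / math.sqrt(var_t * var_y)  # sample correlation coefficient
    slope_cov = cov / var_t
    slope_corr = r * math.sqrt(var_y) / math.sqrt(var_t)
    intercept = mean_y - slope_cov * mean_t
    return slope_cov, slope_corr, intercept


# Hypothetical data, not taken from the article's tables.
slope_cov, slope_corr, intercept = slope_forms(
    [0.0, 1.0, 2.0, 3.0], [1.0, 1.4, 2.1, 2.4]
)
```

Because the slope is the covariance divided by the variance of the time variable, a larger covariance increases its magnitude and a larger time variance reduces it, which is the mechanism behind the shifts described above.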
5. Reliable Intervals for Solving a Boundary Value Problem
To find the limits of the reliable intervals for the function let us represent it in the form
where in the case of linear regression [39]
The lower and upper bounds of the reliable regression zone are obtained by connecting the points with coordinates and , respectively. We would like to note that is the solution of the equation
where is the number of degrees of freedom, ; is the Gamma function [36]. The “+” sign in Formula (28) refers to the upper limit of the critical domain, and the “−” sign refers to the lower limit of the critical domain.
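An equation of this kind can be solved numerically. The sketch below assumes the standard two-sided condition: the integral of the Student density over (−t, t) equals the reliability level. It uses Simpson integration and bisection from the standard library only; the function names are hypothetical, and the exact form of the article's equation (30) is not reproduced.

```python
import math


def student_pdf(x, k):
    """Density of Student's distribution with k degrees of freedom."""
    log_coef = math.lgamma((k + 1) / 2) - math.lgamma(k / 2)
    coef = math.exp(log_coef) / math.sqrt(math.pi * k)
    return coef * (1.0 + x * x / k) ** (-(k + 1) / 2)


def central_probability(t, k, steps=2000):
    """P(-t < T < t) by Simpson's rule on [0, t], doubled by symmetry."""
    h = t / steps
    total = student_pdf(0.0, k) + student_pdf(t, k)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * student_pdf(i * h, k)
    return 2.0 * total * h / 3.0


def t_quantile(level, k):
    """Solve central_probability(t, k) = level for t by bisection."""
    lo, hi = 0.0, 1.0
    while central_probability(hi, k) < level:
        hi *= 2.0
    for _ in range(50):
        mid = (lo + hi) / 2.0
        if central_probability(mid, k) < level:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


t_10 = t_quantile(0.95, 10)
```

The resulting value grows with the reliability level and shrinks as the number of degrees of freedom grows.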
Relationship (30), when is a natural number, can be represented as
where , [33]. From formula (31), it follows that as the sample size increases, the values of the integrand decrease, so in order for the area under the integrand to remain constant (namely ), the integration interval must increase, and, accordingly, the value of must decrease. However, the value increases as the level of reliability increases. Also, the width of the reliable interval for the function increases with increasing , since the reliability level is and with increasing deviation increases. By substituting expression (28) into relation (15), the following is obtained:
Similarly to (31), the solution of the boundary value problem (1)–(3), (7) is represented in the form
Taking into consideration the structure of expression (32), we have
Substituting the expression by Formula (29) into Formula (33), one can obtain
After integration, the half-width of the reliable interval for solving the original boundary value problem (1)–(3), (7) is obtained
Note that the function is directly proportional to the Student’s t value and to the standard deviation . Notably, at , and at ; therefore , if , then .
6. The Two-Sided Critical Domain for the Predicted Value of the Built Linear Regression Model
To set a two-sided critical domain for function as the solution to the boundary value problem (1)–(3), (7), first, using the method given in [32], we find the two-sided critical domain for the linear regression (4). According to the given level of significance , the reliability level is determined. According to the table of experimental data (Table 1), the matrix of regressors can be written as
For the predicted value of the linear regression , variance is determined by the formula
where . The expected bilateral critical domain for is obtained by substituting the estimate of the standard deviation (34) into . Then the true value of the required function on the lower boundary of the layer with probability varies within the range of [32]:
The determinant of the next information matrix
is obtained in the form
Then the error matrix is
The lengths of the bilateral critical domains will be different at different points of the factor space, since the vectors are different in these points, i.e., the accuracy of the predicted value of the linear regression may vary at different time moments .
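The dependence of the prediction accuracy on the time point can be made concrete for a straight-line model. The sketch below assumes the usual regressor rows (1, t_i) and evaluates the quadratic form x0ᵀ (XᵀX)⁻¹ x0 with x0 = (1, t0), which enters the standard variance formula for a fitted value; whether the article's formula contains additional terms is not recoverable here, and the names are hypothetical.

```python
def prediction_leverage(times, t0):
    """x0^T (X^T X)^{-1} x0 for rows (1, t_i) and x0 = (1, t0)."""
    n = len(times)
    s1 = sum(times)
    s2 = sum(t * t for t in times)
    det = n * s2 - s1 * s1  # determinant of the information matrix X^T X
    if det == 0:
        raise ValueError("time points must not all be equal")
    # The inverse of [[n, s1], [s1, s2]] is (1/det) * [[s2, -s1], [-s1, n]].
    return (s2 - 2.0 * t0 * s1 + n * t0 * t0) / det


# Hypothetical uniform grid of measurement times.
grid = [0.0, 0.5, 1.0, 1.5, 2.0]
at_mean = prediction_leverage(grid, 1.0)
at_start = prediction_leverage(grid, 0.0)
at_end = prediction_leverage(grid, 2.0)
```

For this grid the quadratic form is smallest at the mean of the time points and equal at the two ends, so a band built from it is narrowest in the middle of the observation interval and widens symmetrically towards both ends.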
Let us find the boundaries of the two-sided critical domain for the function .
Suppose
where in the case of linear regression, which follows from Formula (35),
So
“+” refers to the upper limit of the critical domain in Formula (36), and “−” refers to the lower limit of the critical domain.
Having studied the influence of the on the coefficients , (38), note that as the number of measurements increases, these coefficients decrease, moreover
Then increasing the number of experiments in the same interval at a constant value narrows the bilateral critical domain (35), within which the experimental data fall with probability . At the same time, an increase in the standard deviation of with the same number of measurements widens the two-sided critical domain for the linear regression function . The influence of the variance of the variable on the bilateral critical domain is similar to the influence of ; however, an increase by the same factor slows down the growth rate of this interval. Since the coefficient is directly proportional to the Student’s t value , the value of the coefficient increases with the reliability level .
7. Finding a Two-Sided Critical Domain for the Solution of a Boundary Value Problem
Substituting the representation of the function (36) into relationship (15), the boundaries of the two-sided critical domain for solving the boundary value problem (1)–(3), (7) are obtained
In the case of imposing a linear regression as a condition on the lower boundary of the layer, taking into account Formula (16), Formula (39) can be represented as
Here . Indefinite integrals and cannot be expressed in elementary functions, so we find approximate expressions for them. Since the function is continuously differentiable the required number of times, we expand it into a Taylor series over the interval of the given experimental measurements in the neighborhood of a point from this interval:
Assume . For the Taylor series to converge, the following condition must be ensured or [37]. Limiting ourselves to the first three terms of the Taylor series, let us analyze the function , calculated by Formula (41) depending on the value of .
The total residual term of the series in this case is calculated by the formula
Table 2 shows the exact values of the function and the values of its expansion in the Taylor series (42) with a limit of three terms for three levels of reliability, calculated for 0.001, 0.003 in the interval . Figure 1 demonstrates their corresponding total residual terms , calculated by (37), where is the middle of the interval . Curves 1–3 correspond to the values , 0.95, 0.99 for (Figure 1a) and (Figure 1b). Since varying the parameters within a wide range gives a similar result (Figure 1), we assume that restricting the series (41) to three terms approximates the function (37) sufficiently well.
Table 2.
Exact and approximate values of the function for different values .
Figure 1.
Graphs of the total residual terms of the Taylor series (42) for different reliability levels at (a) and (b).
In case , where , and , the following is true
Then, for the bilateral critical domain, the following bounds are obtained
Consider that . Then, for long-term processes described by the original boundary value problem, the greater the , the greater the width of the bilateral critical domain, i.e., functions have no stationary mode (time asymptotic). The width of the bilateral critical domain is most influenced by the coefficient . In particular, the smaller the value of , the narrower the two-sided critical domain for the same level of reliability . The larger the variance of the predicted value of the linear regression , the wider the two-sided critical domain.
8. Influence of Statistical Characteristics of a Sample on Reliable Intervals and Bilateral Critical Domain of a Solution to a Boundary Value Problem
Let us analyze the influence of the statistical characteristics of experimental data on the required function at the lower boundary of the layer using concrete examples. Samples of experimental data were obtained for both the uniform and non-uniform distribution of the study time interval [39]. The basic parameters are , , , and . The series is calculated with an accuracy of . Let us consider the cases of samples of large and small volumes characterized by large or small variance.
Figures 2, 4, 6, 8, 10, and 12 demonstrate linear regression for the relevant sample (Figures 2a, 4a, 6a, 8a, 10a, and 12a) and solutions to the boundary value problem (1)–(3), (7) (Figures 2b, 4b, 6b, 8b, 10b, and 12b). In Figures 2a, 4a, 6a, 8a, 10a, and 12a, solid lines (curves 1) indicate the function , the dashed and dotted lines indicate its reliable intervals (curves 2−, 3−, 4− and 2+, 3+, 4+), and the green dots indicate the experimental data presented in the corresponding table. Figures 2b, 4b, 6b, and 10b show solutions to the initial boundary value problem (solid lines) for long time intervals at time moments 0.1, 0.5, 2 (curves 1–3). Figures 8b and 12b show the same solutions for short time intervals at time moments 0.1, 0.5, 1 (curves 1–3). The curves (dashed lines) with the index “+” are calculated for the upper limits of the reliable intervals , and with the index “−” for the lower ones . The lower indices 1, 2, 3 correspond to the values of the reliability level , 0.95, 0.99. Figures 3, 5, 7, 9, 11 and 13 show solutions to the initial boundary value problem , that are normalized to the value of the function on the upper boundary of the layer , and the corresponding bilateral critical domains calculated by the Fisher criterion, for 0.1 (Figures 3a, 5a, 7a, 9a, 11a and 13a) and 0.5 (Figures 3b, 5b, 7b, 9b, 11b and 13b). Curves 2−, 2+ are calculated for , curves 3−, 3+ are calculated for , and curves 4−, 4+ are calculated for .
Tables 4, 6, 8, 10, 12 and 14 contain the maximum widths of the two-sided critical domain of the boundary value problem (1)–(3), (7) for different time moments and , 0.95, 0.99 for six samples.
8.1. Sample I: Large Sample, Long Time Interval, Large Variance
A sample of experimental data with volume is shown in Table 3.
Table 3.
Experimental data with a large variance over a long time interval for .
The sample range is = [0, 2.9716], and the sample variance is = 0.893931523259907. Let us build a regression based on the data in Table 3 using the least squares method. Based on the appearance of the correlation field, we assume that the time dependence of the required function on the lower boundary of the body is linear. According to the sample data, the coefficients of the linear regression (4) are = 0.835387001287001 and = 0.16861996996997. Now, let us find the reliable intervals for the linear regression coefficients at the reliability levels 0.9, 0.95, and 0.99. For the coefficients and , the reliable intervals are (0.792324843411; 0.878449159163) and (0.080983805814; 0.256256134126) at reliability 0.9; (0.783632562191029; 0.887141440382974) and (0.063294070291443; 0.273945869648497) at reliability 0.95; and (0.765903964133865; 0.904870038440137) and (0.0272144494313441; 0.310025490508596) at reliability 0.99. For short times of the transfer process described by the boundary value problem (1)–(3), (7), the function monotonically decreases (curve 1, Figure 2b). As increases, the values of the function also increase, and a maximum is formed at the point (curve 2, Figure 2b). This maximum eventually shifts to the lower boundary of the body, for example . Afterwards, the behavior of the function but its values in the entire body area increase significantly (curve 3, Figure 2b). The boundaries of the bilateral critical domain , calculated by Formula (43), for this sample and all others are given in Appendix A.
Figure 2.
Linear regression (a) and solutions to the boundary value problem at different moments of time (b) and their reliable intervals.
As the transfer time increases, not only do the values increase, but the widths of the reliable intervals also increase (Figure 2b). Thus, with an increase in value from 0.5 to 2, the maximum widths of the reliable intervals more than double, regardless of the value of . For example, . At the same time, increasing the level of reliability from 0.9 to 0.99 widens the reliable interval by 61%. In particular, .
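The 61% figure can be cross-checked against the reliable intervals quoted for the first regression coefficient of Sample I, because the half-width of each interval scales with the same Student factor. The sketch below only reuses numbers printed in the text; that the intervals for the solution scale in the same way is an assumption based on the stated proportionality to the Student value.

```python
# Reliable intervals of the first regression coefficient for Sample I,
# copied from the text; keys are the reliability levels.
intervals = {
    0.90: (0.792324843411, 0.878449159163),
    0.95: (0.783632562191029, 0.887141440382974),
    0.99: (0.765903964133865, 0.904870038440137),
}


def half_width(bounds):
    """Half of the distance between the lower and upper limits."""
    return (bounds[1] - bounds[0]) / 2.0


def midpoint(bounds):
    """Centre of the interval; should equal the point estimate."""
    return (bounds[0] + bounds[1]) / 2.0


# Relative widening when the reliability level goes from 0.9 to 0.99.
ratio = half_width(intervals[0.99]) / half_width(intervals[0.90])
centres = [midpoint(bounds) for bounds in intervals.values()]
```

The ratio comes out at about 1.61, in line with the stated widening, and all three intervals are centred on the reported coefficient value 0.835387001287001.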
In contrast to the reliable intervals, the bilateral critical domain of the problem solution initially narrows (Figure 3), reaching its narrowest values in the middle of the time interval ; it then expands, symmetrically to the narrowing in the initial time interval (Table 4). Thus, for short times, the width of this domain decreases by 17%, namely 0.833 for (Figure 3a,b); for medium times, the difference between and decreases by 16.6%. The largest widths of the bilateral critical domain at increasing increase by 60%. For example, .
Figure 3.
Solutions to the boundary value problem and corresponding bilateral critical domains for (a) and (b).
Table 4.
Maximum width of the two-sided critical solution domain for sample I.
8.2. Sample II: Large Sample, Long Time Interval, Small Variance
A sample of experimental data with volume is shown in Table 5.
Table 5.
Experimental data with low variance for a long time interval.
The sample range is [0, 0.694], and the sample variance is = 0.174264028380218.
For the regression coefficients = 0.155924066924067 and = 0.218493993993994, the reliable intervals are (0.139918763614950; 0.171929370233184) and (0.185921464949008; 0.251066523038980) at reliability 0.9; (0.136688; 0.17516) and (0.179346560289016; 0.257641427698971) at reliability 0.95; and (0.130098675459337; 0.181749458388797) and (0.165936519928265; 0.271051468059722) at reliability 0.99.
As the transfer time increases, the function value increases in the entire body area (Figure 4b). For short and mid-long times is a monotonically decreasing function. The local maximum of the function begins to form at moment 0.6 in the point . With the increase in this maximum grows, shifts to the lower boundary of the layer () and becomes global (Figure 4b). With the increase in time, the width of the reliable interval grows (Figure 4b), similarly for different values . For example, .
Figure 4.
Linear regression (a) and solutions to the boundary value problem at different time moments (b) and their reliable intervals.
The two-sided critical domain for this sample also narrows initially (Figure 5) and then, in the second half of the time interval , expands (Table 6). The width of this domain for short times decreases by 17%, namely 0.833 for (Figure 5); for long times, the difference between and decreases by 16.6%. There is also symmetry in the widths in time intervals and . The largest widths of the bilateral critical domain at increasing also increase by 60%. For example, . For samples I and II, the growth rate of the reliable interval, the initial reduction of the bilateral critical domain , and its subsequent growth are the same.
Figure 5.
Solutions to the boundary value problem and corresponding bilateral critical domains for (a) and (b).
Table 6.
Maximum width of the two-sided critical solution domain for sample II.
8.3. Sample III: Small Sample, Long Time Interval, Large Variance
A sample of experimental data with volume is shown in Table 7.
Table 7.
Experimental data with a large variance over a long time interval for .
The sample range is [0, 4.4371], and the sample variance is = 1.47815809975753. For the regression coefficients 1.361503496503 and = −0.158314102564102, the reliable intervals are (1.2942027604045400; 1.4288042326024500) and (−0.289421049582769; −0.0272071555454357) at reliability 0.9; (1.27876771583259; 1.4442392771744) and (−0.319489689069116; 0.00286148394091157) at reliability 0.95; and (1.24382131097189; 1.47918568203511) and (−0.387567940751664; 0.0709397356234593) at reliability 0.99.
For the experimental data on the values of the required function at the lower boundary of the layer, presented in Table 7, the function also monotonically decreases for short transfer times (Curve 1, Figure 6b). As increases, the values of the function increase and a maximum is formed at the point (Curve 2, Figure 6b), which grows over time and shifts to the lower boundary of the layer. In particular, at the maximum value of the function is reached at the point . For long times, the values of the function increase significantly in the entire body domain (Curve 3, Figure 6b). With the prolongation of time, the width of the reliable interval increases (Figure 6b), in the same way for different values .
Figure 6.
Linear regression (a) and solutions to the boundary value problem at different moments of time (b) and their reliable intervals.
For example,
For this sample, the bilateral critical domain of the problem solution also narrows over time (Figure 7) and then expands (Table 8). The width of this domain for short times decreases by 17%, namely for (Figure 7); for long times, the difference between and decreases by 10.6%. The largest widths of the bilateral critical domain with increasing increase by 75%. For example, . Symmetry in relation to the widths of time intervals and is also present.
Figure 7.
Solutions to the boundary value problem and the corresponding bilateral critical domains for (a) and (b).
Table 8.
Maximum width of the two-sided critical solution domain for sample III.
8.4. Sample IV: Small Sample, Short Time Interval, Large Variance
Let us now consider a sample of experimental data with volume obtained over a short time interval (Table 9).
Table 9.
Experimental data with a large variance over a short time interval.
The sample range is [0, 5.9291], and the sample variance is 2.18380650788105. For the regression coefficients 5.9880034965035 and −0.2559269230769220, the reliable intervals at the three reliability levels considered are, in the order listed, (5.4663064488105100; 6.5097005441964900) and (−0.5946954178281560; 0.0828415716743118); (5.34665817078342; 6.62934882222358) and (−0.672390063074964; 0.16053621692112); (5.07576312118831; 6.90024387181869) and (−0.848298108927857; 0.336444262774013).
For the experimental data on the values of the required function at the lower boundary of the layer, presented in Table 9, a local maximum of the function begins to form already at short times. It forms in the middle of the body, grows over time, and shifts toward the lower boundary of the layer. For example, and 0.635718339 (Curve 1, Figure 8b), and 4.4428612 (Curve 2, Figure 8b), and 9.007544363 (Curve 3, Figure 8b). The width of the reliable interval increases as the process time grows (Figure 8b), and does so in the same way for the different values . For example, 1.93. For this sample, the bilateral critical domain of the solution also narrows with time (Figure 9), and then widens symmetrically on the intervals and (Table 10). The width of this domain for short times reduces by 25%, namely = 0.75 for (Figure 9); for medium times, the difference between = 0.5 and increases by 61.5%. The largest widths of the bilateral critical domain at increasing also increase by 75%. For example,
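As an illustration of how reliable intervals of this kind are typically obtained, the following minimal Python sketch fits a straight line by least squares and builds two-sided intervals for both coefficients from Student quantiles. The data, the function name `ols_with_intervals`, the sample size of 12, and the reliability levels 0.90, 0.95 and 0.99 are assumptions made for the example; they are not the values of Table 9, and the sketch uses the standard textbook formulas, which may differ in detail from those applied in the article.

```python
import math


def ols_with_intervals(t, y, t_crit):
    """Fit y = a0 + a1*t by least squares and return (a0, a1, ci_a0, ci_a1).

    t_crit is the two-sided Student quantile for n - 2 degrees of freedom
    at the chosen reliability level (taken from a table, not computed here).
    """
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    s_tt = sum((ti - t_mean) ** 2 for ti in t)
    s_ty = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    a1 = s_ty / s_tt                 # slope
    a0 = y_mean - a1 * t_mean        # free term
    # Residual variance with n - 2 degrees of freedom.
    rss = sum((yi - a0 - a1 * ti) ** 2 for ti, yi in zip(t, y))
    s2 = rss / (n - 2)
    se_a1 = math.sqrt(s2 / s_tt)
    se_a0 = math.sqrt(s2 * (1.0 / n + t_mean ** 2 / s_tt))
    ci_a0 = (a0 - t_crit * se_a0, a0 + t_crit * se_a0)
    ci_a1 = (a1 - t_crit * se_a1, a1 + t_crit * se_a1)
    return a0, a1, ci_a0, ci_a1


# Hypothetical 12-point sample (NOT the data of Table 9).
t_obs = [0.1 * k for k in range(1, 13)]
y_obs = [5.9, 6.1, 5.7, 6.0, 5.8, 5.9, 5.6, 6.0, 5.7, 5.8, 5.5, 5.9]

# Tabulated two-sided Student quantiles for 10 degrees of freedom
# (the reliability levels themselves are assumed for this example).
T_CRIT = {0.90: 1.812, 0.95: 2.228, 0.99: 3.169}

for level, tc in T_CRIT.items():
    a0, a1, ci_a0, ci_a1 = ols_with_intervals(t_obs, y_obs, tc)
    print(level, ci_a0, ci_a1)
```

Each interval is centred on the point estimate, and its half-width is the product of the Student quantile and the standard error of the coefficient, so a higher reliability level can only widen it.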
Figure 8.
Linear regression (a) and solutions to the boundary value problem at different moments in time (b) and their reliable intervals.
Figure 9.
Solutions to the boundary value problem and the corresponding bilateral critical domains for (a) and (b).
Table 10.
Maximum width of the two-sided critical solution domain for sample IV.
8.5. Sample V: Small Sample, Long Time Interval, Small Variance
A sample of experimental data with volume is obtained over a long time interval.
The sample range is [0, 0.794], and the sample variance is 0.223456914022016. For the regression coefficients 0.180180811808118 and 0.26010024600246, the reliable intervals at the three reliability levels considered are, in the order listed, (0.128690059337185; 0.231671564279051) and (0.154680344031757; 0.365520147973162); (0.116880945418082; 0.243480678198154) and (0.130502883575815; 0.389697608429105); (0.0901439914598696; 0.270217632156367) and (0.0757628199719453; 0.444437672032974).
For the experimental data on the values of the required function at the lower boundary of the layer, presented in Table 11, the function decreases monotonically for short times (Curve 1, Figure 10b). The local maximum of the function forms at the point (Curve 2, Figure 10b) and, as in the previous cases, increases with time and shifts toward the lower boundary of the layer (Curve 3, Figure 10b). For example, and (Curve 3, Figure 10b).
Table 11.
Experimental data with low variance over a long time interval.
Figure 10.
Linear regression (a) and solutions to the boundary value problem at different moments in time (b) and their reliable intervals.
The width of the reliable interval increases as the process time grows (Figure 10b); moreover, this occurs in the same manner for the different values . For example,
For this sample, the bilateral critical domain of the solution of the problem also narrows with time (Figure 11), and then widens symmetrically on the intervals and (Table 12). The width of this domain for short times decreases to 17%, namely, for (Figure 11); for medium times, the difference between and decreases by 16.07%.
Figure 11.
Solutions to the boundary value problem and the corresponding bilateral critical domains for (a) and (b).
Table 12.
Maximum width of the two-sided critical solution domain for sample V.
The largest widths of the bilateral critical domain with increasing increase by 75%. For example,
8.6. Sample VI: Small Sample, Short Time Interval, Small Variance
A sample of experimental data with volume is obtained over a short time interval (Table 13).
Table 13.
Experimental data with low variance over a short time interval.
The sample range is [0, 0.02], and the sample variance is 0.000044820833333. For the regression coefficients 0.0181030150250685 and 0.00291602180076567, the reliable intervals at the three reliability levels considered are, in the order listed, (0.014367706354996; 0.02183832369514140) and (0.000667626599415; 0.00516441700211638); (0.0135110343606834; 0.0226949956894535) and (0.0001519698301164; 0.0056800737714150); (0.01157144771378340; 0.02463458233635350) and (−0.00101552600347659; 0.00684756960500794).
For the experimental data on the values of the required function at the lower boundary of the layer, presented in Table 13, the function decreases monotonically over the entire time interval (Figure 12b). And time is already a stationary one . As the time approaches the growth rate of the function slows significantly (Figure 12b). The width of the reliable interval for this sample is substantially smaller than for the previously discussed ones, and it increases only slightly as the process time grows (Figure 12b). For example, 1.43. For this sample, the bilateral critical domain of the solution is much narrower than for the other samples. Nevertheless, for sample VI the width of the bilateral critical domain also narrows with time (Figure 13) and then widens symmetrically on the intervals and (Table 14). The width of this domain for short times decreases by 25%, namely 0.746 for (Figure 13); for long times, the difference between and increases by 34%. The largest widths of the bilateral critical domain at increasing increase by 73%. For example, .
For the considered samples, the reliable interval and the bilateral critical domain for the function are symmetrical at all time moments, i.e., and for , .
Figure 12.
Linear regression (a) and solutions to the boundary value problem at different moments in time (b) and their reliable intervals.
Figure 13.
Solutions to the boundary value problem and the corresponding bilateral critical domains for (a) and (b).
Table 14.
Maximum width of the two-sided critical solution domain for sample VI.
For all types of samples, the following can be observed: the lower the value of the reliability level , the narrower the reliable intervals for both the linear regression and the solution of the boundary value problem (1)–(3), (7) (Figure 2, Figure 4, Figure 6, Figure 8, Figure 10 and Figure 12). The same is true for : the higher the level of reliability, the wider the bilateral critical domain (Figure 3, Figure 5, Figure 7, Figure 9, Figure 11 and Figure 13). The point with the largest width of the reliable interval for the function is the same for all six samples regardless of the level of reliability , but may differ at different moments in time. So, 0.8 for , 0.725 for , and (Figure 2b, Figure 4b, Figure 6b and Figure 10b). At the same time, the largest width of the bilateral critical domain is observed at the lower boundary of the layer, i.e., 1, and is constant over the entire time period of the process under study (Figure 3, Figure 5, Figure 7, Figure 9, Figure 11 and Figure 13). The point , where the widths of the reliable interval and the bilateral critical domain are equal to zero , is located at the upper boundary of the layer, since the value of the required function at this boundary is known and constant in time.
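The widening of the reliable intervals with the reliability level, and their symmetry about the point estimate, can be checked directly on the limits reported above. The short Python sketch below does this for the first regression coefficient of samples III and IV, using the limits in the order in which the text lists them; the helper names `widths` and `is_symmetric` and the tolerance are illustrative choices.

```python
# Reliable limits for the first regression coefficient, copied from the text
# for samples III and IV, in the order in which the three levels are listed.
limits = {
    "III": [
        (1.2942027604045400, 1.4288042326024500),
        (1.27876771583259, 1.4442392771744),
        (1.24382131097189, 1.47918568203511),
    ],
    "IV": [
        (5.4663064488105100, 6.5097005441964900),
        (5.34665817078342, 6.62934882222358),
        (5.07576312118831, 6.90024387181869),
    ],
}
# Point estimates of the same coefficient, also copied from the text.
estimates = {"III": 1.361503496503, "IV": 5.9880034965035}


def widths(intervals):
    """Return the width (upper limit minus lower limit) of each interval."""
    return [hi - lo for lo, hi in intervals]


def is_symmetric(intervals, centre, tol=1e-9):
    """Check that every interval is centred on the point estimate."""
    return all(abs((lo + hi) / 2.0 - centre) < tol for lo, hi in intervals)


for name, intervals in limits.items():
    w = widths(intervals)
    print(name, "widths:", w)
    print(name, "strictly widening:", w[0] < w[1] < w[2])
    print(name, "symmetric:", is_symmetric(intervals, estimates[name]))
```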
9. Conclusions
The presented study proposes a methodology for studying physical, chemical, or biological processes that can be described by parabolic boundary value problems with incomplete data at the body boundary. A specific boundary value problem is formulated for a layer with a non-zero initial condition, the action of a steady source on one boundary of the body, and a sample of experimental data on the required function on the other. Based on the experimental data, a linear regression model is built using the least squares method and is used as a boundary condition. The solution to the boundary value problem is found by applying the finite integral Fourier transform. It is obtained for a general regression model containing integral terms of the regression function and is specified for the case of linear regression based on experimental data at the lower boundary of the layer.
A two-sided statistical estimate of the solution to the boundary value problem is determined through the coefficients of linear regression, which is analyzed in relation to the influence of sample size and covariance. The corresponding reliable intervals for the linear regression and the required function are determined on the basis of the obtained solution to the problem with a given level of reliability.
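The idea of bounding the solution through the regression coefficients can be sketched numerically. The Python example below is not the analytical estimate of the article: it replaces the finite Fourier transform with an explicit finite-difference scheme for a generic dimensionless problem u_t = d·u_xx on 0 ≤ x ≤ 1, and it bounds the solution simply by running the solver with the lower and the upper limits of both coefficients. The function name `solve_layer`, the values of `c_top`, `u_init`, `d`, and `t_end`, and the reading of the first reported coefficient as the free term and the second as the slope are all assumptions; the coefficient limits are rounded values from sample III.

```python
def solve_layer(a0, a1, c_top=1.0, u_init=0.0, d=1.0, t_end=0.5, nx=21, r=0.4):
    """Explicit finite-difference solution of u_t = d*u_xx on 0 <= x <= 1.

    Boundary data: u(0, t) = c_top (steady source) and u(1, t) = a0 + a1*t
    (linear regression in time); initial state u(x, 0) = u_init.
    r = d*dt/dx**2 must not exceed 0.5 for the scheme to be stable.
    Returns the nodal values at the final time.
    """
    dx = 1.0 / (nx - 1)
    dt = r * dx * dx / d
    steps = int(round(t_end / dt))
    u = [u_init] * nx
    u[0], u[-1] = c_top, a0
    for k in range(1, steps + 1):
        new = u[:]
        for i in range(1, nx - 1):
            new[i] = u[i] + r * (u[i - 1] - 2.0 * u[i] + u[i + 1])
        new[0] = c_top                  # upper boundary: known, constant
        new[-1] = a0 + a1 * k * dt      # lower boundary: regression line
        u = new
    return u


# Rounded limits and estimates of sample III at the first listed level
# (assumed assignment: first coefficient = free term, second = slope).
lower = solve_layer(1.2942, -0.2894)
centre = solve_layer(1.3615, -0.1583)
upper = solve_layer(1.4288, -0.0272)

band = [hi - lo for lo, hi in zip(lower, upper)]
print("band width at upper boundary:", band[0])
print("band width at lower boundary:", band[-1])
```

For r ≤ 0.5 each update is a weighted average with non-negative weights, so smaller boundary values can only give a smaller solution, which is why the two extreme runs bracket the central one. Because both coefficient limits are taken simultaneously, this band is a conservative simplification rather than a joint statistical estimate.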
It is shown that the higher the absolute values of the required function, the greater the width of the reliable interval. For short times, the influence of the reliable interval for the slope of the regression equation is imperceptible, while the influence of the reliable interval for the free term of the regression is many times greater, especially in the vicinity of the lower boundary of the layer. It is also noted that the smaller the variance of the sample of experimental data, the smaller the width of the reliable interval for the solution of the formulated boundary value problem.
The formula for determining the bilateral critical domain based on the Fisher criterion is obtained and analyzed. The influence of the statistical characteristics of the sample of experimental data on the required function on the lower boundary of the layer is studied on specific examples. The cases of samples with large and small volumes, characterized by large or small variance, at large or small time intervals are considered.
The numerical analysis of the solution to the boundary value problem depending on the statistical characteristics of the sample is carried out. It is established that a larger variance of the time variable leads to a decrease in the solution values throughout the entire domain of the body, whereas a higher variance of the responses results in an increase in the values of the desired function. An increase in the correlation coefficient also leads to an increase in the values of the solution and the formation of its local or global maximum in the lower half of the layer. The influence of the level of reliability on the reliable intervals for the linear regression and for the solution of the parabolic boundary value problem, as well as on the two-sided critical domain constructed for the solution, is studied. It is shown that for lower values of the reliability level, the reliable intervals are narrower for all types of samples, and so are the corresponding bilateral critical domains.
The developed methodology can be generalized to a wider class of inverse and direct boundary value problems for parabolic partial differential equations, including those with other types of boundary conditions (e.g., Neumann or Robin conditions), variable coefficients, or time-dependent sources. Moreover, the approach can be adapted for models incorporating nonlinear regression or experimental data obtained not only at boundaries but also within the domain. The integration of probabilistic modeling and statistical analysis holds potential for studying transport processes in more complex systems, such as multilayer or randomly inhomogeneous media, where only partial or uncertain data are available. Future research may focus on these generalizations to enhance the applicability of the proposed method.
Another promising direction is the refinement of the statistical modeling of transport processes using more complex regression structures and the incorporation of stochastic effects, especially in systems with multiphase or spatially heterogeneous properties and incomplete boundary or initial data.
These directions open new possibilities for integrating statistical regression with physical modeling in the analysis of transport phenomena under uncertainty.
The results were obtained as part of a grant from the Ministry of Education and Science of Ukraine (project number 0123U101691).
Author Contributions
Conceptualization, O.C. and H.B.; methodology, O.C. and P.P.; software, Y.B.; validation, O.C., H.B., Y.B. and M.V.; formal analysis, H.B.; investigation, O.C., H.B. and Y.B.; resources, M.V.; writing—original draft preparation, O.C. and H.B.; writing—review and editing, P.P. and M.V.; visualization, Y.B.; supervision, O.C. and P.P.; project administration, P.P.; funding acquisition, M.V. and P.P. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Data Availability Statement
The original contributions presented in the study are included in the article.
Conflicts of Interest
The authors declare no conflicts of interest.
Appendix A
Boundaries of the bilateral critical domain , calculated by Formula (43), are specified as follows.
For Sample I, the lower limits are as follows:
- -
- for
- -
- for
- -
- for
For Sample I, the upper limits are as follows:
- -
- for
- -
- for
- -
- for
For Sample II, the lower limits are as follows:
- -
- for
- -
- for
- -
- for
For Sample II, the upper limits are as follows:
- -
- for
- -
- for
- -
- for
For Sample III, the lower limits are as follows:
- -
- for
- -
- for
- -
- for
For Sample III, the upper limits are as follows:
- -
- for
- -
- for
- -
- for
For Sample IV, the lower limits are as follows:
- -
- for
- -
- for
- -
- for
For Sample IV, the upper limits are as follows:
- -
- for
- -
- for
- -
- for
For Sample V, the lower limits are as follows:
- -
- for
- -
- for
- -
- for
For Sample V, the upper limits are as follows:
- -
- for
- -
- for
- -
- for
For Sample VI, the lower limits are as follows:
- -
- for
- -
- for
- -
- for
For Sample VI, the upper limits are as follows:
- -
- for
- -
- for
- -
- for
References
- Moffatt, J.; Scarf, P. Sequential regression measurement error models with application. Stat. Model. 2016, 16, 454–476.
- Sikaroudi, A.E.; Park, C. A mixture of linear-linear regression models for a linear-circular regression. Stat. Model. 2019, 21, 220–243.
- Wilcox, R.R. Linear regression: Robust heteroscedastic confidence bands that have some specified simultaneous probability coverage. J. Appl. Stat. 2017, 44, 2564–2574.
- Ju, M.; Li, X.; Wu, R.; Xu, Z.; Yin, H. Research Hotspots and Trend Analysis in Modeling Groundwater Dense Nonaqueous Phase Liquid Contamination Based on Bibliometrics. Water 2024, 16, 2840.
- Gamburg, Y. Some novel efforts to describe the nucleation and growth at electrodeposition. J. Solid State Electrochem. 2013, 17, 353–359.
- Udina, M.; Rosa Soler, M.; Arasa, R. Effects of nocturnal thermal circulation and boundary layer structure on pollutant dispersion in complex terrain areas: A case study. Int. J. Environ. Pollut. 2012, 48, 47–59.
- Milionis, A.E.; Davies, T.D. Regression and stochastic models for air pollution-I. Review, comments and suggestions. Atmos. Environ. 1994, 28, 2801–2810.
- Lotter, S.; Schafer, M.; Zeitler, J.; Schober, R. Saturating receiver and receptor competition in synaptic DMC: Deterministic and statistical signal models. IEEE Trans. Nanobioscience 2021, 20, 464–479.
- Özdural, A.R.; Alkan, A.; Kerkhof, P.J.A.M. Modeling chromatographic columns: Non-equilibrium packed-bed adsorption with non-linear adsorption isotherms. J. Chromatogr. A 2004, 1041, 77–85.
- dos Santos, K.R.M.; Katsidoniotaki, M.I.; Miller, E.C.; Petersen, N.H.; Marshall, R.S.; Kougioumtzoglou, I.A. Reduced-order modeling and analysis of dynamic cerebral autoregulation via diffusion maps. Physiol. Meas. 2023, 44, 044001.
- Goulopoulos, A.; Etim, E.; Korupolu, S.; Farinelli, W.; Sierra, H.; Anderson, R.R.; Fischbach, A.; Franco, W. Optical, flow, and thermal analysis of a phototherapy extracorporeal membrane oxygenator for treating carbon monoxide poisoning. Lasers Surg. Med. 2023, 55, 390–404.
- Gudnason, K.; Sigurdsson, S.; Jonsdottir, F. Multi-region finite element modeling of drug release from hydrogel based ophthalmic lenses. Math. Biosci. 2021, 331, 108497.
- Wagner, S.; Gschwandl, M.; Nagl, R.; Fischlschweiger, M.; Zeiner, T. Simultaneous Modeling of Swelling and Heat Transfer in Polymers. In Proceedings of the NordPac 2024—60th Annual Microelectronics and Packaging Conference and Exhibition IEEE, Tampere, Finland, 11–13 June 2024.
- Gouze, P.; Puyguiraud, A.; Roubinet, D.; Dentz, M. Characterization and upscaling of hydrodynamic transport in heterogeneous dual porosity media. Adv. Water Resour. 2020, 146, 103781.
- Tian, H.; Li, Q.; Dong, Y. Dissolved oxygen transfer from oscillatory flows to microbes in a permeable organic sediment bed. Int. J. Heat Mass Transf. 2020, 157, 119721.
- Longo, S.; Longo, G.M.; Pavone, E.; Schiavone, F. Deterministic models of minority neutral particle kinetics close to a catalytic surface, based on the formalism of radiative transfer. Plasma Sources Sci. Technol. 2019, 28, 125008.
- Neves, R.; Silva, A.; de Brito, J.; Silva, R.V. Statistical modeling of the resistance to chloride penetration in concrete with recycled aggregates. Constr. Build. Mater. 2018, 182, 550–560.
- Gusev, S.A.; Nikolaev, V.N. Estimation of the Thermal Process in the Honeycomb Panel by a Monte Carlo Method. IOP Conf. Ser. Mater. Sci. Eng. 2018, 302, 012045.
- Ducros, N.; da Silva, A.; Dinten, J.-M.; Peyrin, F. Approximations of the measurable quantity in diffuse optical problems: Theoretical analysis of model deviations. J. Opt. Soc. Am. A Opt. Image Sci. Vis. 2008, 25, 1174–1180.
- De Lemos, M.J.S.; Mesquita, M.S. Comparison of four thermo-mechanical models for simulating reactive flow in porous materials. Defect Diffus. Forum 2010, 297, 1493–1501.
- Bialobrzewski, I.; Markowski, M. Mass transfer in the celery slice: Effects of temperature, moisture content, and density on water diffusivity. Dry. Technol. 2004, 22, 1777–1789.
- Tunca, C.; Ford, D.M. A hierarchical approach to the molecular modeling of diffusion and adsorption at nonzero loading in microporous materials. Chem. Eng. Sci. 2003, 58, 3373–3383.
- Filipov, S.M.; Hristov, J.; Avdzhieva, A.; Faragó, I. A Coupled PDE-ODE Model for Nonlinear Transient Heat Transfer with Convection Heating at the Boundary: Numerical Solution by Implicit Time Discretization and Sequential Decoupling. Axioms 2023, 12, 323.
- Koleva, M.N.; Vulkov, L.G. High-Order Approximations for a Pseudoparabolic Equation of Turbulent Mass-Transfer Diffusion. Axioms 2025, 14, 319.
- Zhang, L.; Kong, H.; Zheng, H. Numerical manifold method for steady-state nonlinear heat conduction using Kirchhoff transformation. Sci. China Technol. Sci. 2024, 67, 992–1006.
- Chernukha, O.; Chuchvara, A.; Bilushchak, Y.; Pukach, P.; Kryvinska, N. Mathematical Modelling of Diffusion Flows in Two-Phase Stratified Bodies with Randomly Disposed Layers of Stochastically Set Thickness. Mathematics 2022, 10, 3650.
- Chaplya, Y.; Chernukha, O.; Bilushchak, Y.; Chuchvara, A.; Greguš, M.; Pukach, P. Advanced approach to mathematical modeling of the impurities diffusion in the process of water softening with limited particles sorption. Sci. Rep. 2025, 15, 5269.
- Pukach, P.; Chernukha, O.; Chernukha, Y.; Vovk, M. Three-Dimensional Mathematical Modeling and Simulation of the Impurity Diffusion Process Under the Given Statistics of Systems of Internal Point Mass Sources. Modelling 2025, 6, 23.
- Chernukha, O.; Pukach, P.; Bilushchak, H.; Bilushchak, Y.; Vovk, M. Advanced Statistical Approach for the Mathematical Modeling of Transfer Processes in a Layer Based on Experimental Data at the Boundary. Symmetry 2024, 16, 802.
- Apostol, M. Equations of Mathematical Physics; Cambridge Scholars Publishing: Newcastle upon Tyne, UK, 2018; p. 250.
- Bakhrushin, V.E. Methods of Data Analysis; KPU: Zaporizhzhia, Ukraine, 2011; p. 268. (In Ukrainian)
- Vuchkov, I.; Boyadzhieva, L.; Solakov, E. Applied Linear Regression Analysis; Financy i Statistika: Moscow, Russia, 1987; p. 238. (In Russian)
- Gumbel, E.J. Statistics of Extremes; Dover Publications: Chicago, IL, USA, 2004; p. 400.
- Sneddon, I. The Use of Integral Transforms; Tata McGraw-Hill: New York, NY, USA, 1979; p. 539.
- Kamke, E. Differentialgleichungen Lösungsmethoden und Lösungen; Vieweg+Teubner Verlag: Stuttgart, Germany, 1983; p. 670.
- Abramowitz, M.; Stegun, I. Handbook of Mathematical Functions; National Bureau of Standards: Washington, DC, USA, 1972; p. 1046.
- Prudnikov, A.P.; Brychkov, Y.A.; Marichev, O.I. Integrals and Series. Elementary Functions; Nauka: Moscow, Russia, 1981; p. 800. (In Russian)
- Kartashov, M.V. Probability, Processes, Statistics; Kyiv University Publ.: Kyiv, Ukraine, 2007; p. 504. (In Ukrainian)
- Grigelionis, B.; Prohorov, Y.V.; Sazonov, V.V.; Statulevičius, V. Probability Theory and Mathematical Statistics; De Gruyter: Berlin, Germany, 2020; p. 624.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).