1. Introduction
Many problems in engineering, ecology, biology, pharmacology, and other fields of technical and technological research require further development of approaches and methods for the mathematical description of non-equilibrium processes of different physical natures [1]. These problems are relevant because effective methods and estimates are needed for predicting the spread of man-made pollution, assessing the quality of drinking water and improving its treatment on an industrial scale, determining how drugs diffuse into tissue, and describing the diffusion of aggressive substances when assessing the reliability and durability of structural elements and components in order to prevent the destruction of the relevant materials. Statistical modeling methods, which combine classical approaches to mathematical modeling with methods of mathematical statistics, are therefore relevant. Such studies make it possible to obtain a reliable forecast of the processes taking place in environmental objects, industrial equipment units, the human body, etc., and to take the necessary measures in time to prevent their negative development. The main approaches and methods of statistical modeling have developed rapidly over the past few decades [2,3].
These methods are widely used in environmental problems, where the difficulty of the analysis stems largely from the multiphase nature of the diffusion process. Over recent decades, researchers have developed various statistical models of multiphase flow, mass transfer, and solute transport to simulate the distribution of groundwater contamination [
4,
5], to explore the concepts of electrochemical nucleation, and in [
6,
7] to analyze meteorological fields obtained to reflect the characteristics of air pollution. Statistical modeling methods are actively used in biology and medicine. In [
8], a deterministic and statistical model describing synaptic communication in a biological molecular communication system was proposed and formulated for the diffusion equation in terms of a hypergeometric distribution. Özdural et al. [
9] presented a mathematical statistical model based on nonequilibrium conditions that describes the dynamic adsorption of proteins. A data-driven methodology for economical modeling and the analysis of dynamic cerebral autoregulation of arterial blood pressure, cerebral blood flow velocity, and their time derivatives was developed [
10]. The dynamics of diffusion processes were analyzed by examining the eigenvalues of the Markov random walk matrix in the data set. Blood propagation, blood flow dynamics, and diffusion were modeled in [
11] using the Monte Carlo method and the laminar Navier–Stokes equation. Computational statistical models help to eliminate the risks of excessive blood heating and improve the design of blood flow devices. Models of the diffusion release of drugs from hydrogel ophthalmic lenses were analyzed using probabilistic modeling methods in [
12].
The statistical modeling technique has shown particular efficiency in engineering problems. The aim of paper [
13] was to model heat and mass transfer in electronics problems, whose mathematical models are based on the perturbed circuit equation. Using the precise numerical statistical modeling of flow and transfer, Gouze et al. [
14] studied the increase in the scale of passive dissolved matter transfer in a carbonate rock sample characterized by microporous areas. Statistical modeling was used in [
15] to study the transfer of dissolved oxygen with a high Schmidt number from oscillating turbulent flows to a permeable layer of microbial sediment. The mathematical model of the kinetics of neutral particles diluted in a gas or plasma and reacting with a catalytic surface is formulated as a boundary value problem for the reaction–diffusion equation. However, this approximation does not always adequately reflect the transfer process. As an alternative, Longo et al. [
16] used Monte Carlo simulation, taking into account the problem of statistical error. The aim of the study [
17] is to develop probabilistic analytical models for predicting the characteristics of concrete in terms of chloride penetration. Gusev and Nikolaev [
18] proposed a method for assessing the thermal state of insulating cellular panels, which is described by a boundary value problem for a parabolic equation with a discontinuous diffusion coefficient and boundary conditions of the third kind. The problem is solved by means of a probabilistic representation of the functional expectation of the diffusion process corresponding to the boundary value problem. The possibility of non-invasively determining the diffusion approximation parameters in radiation transfer theory at a fixed input radiation power was investigated by numerical and statistical modeling in [
19]. De Lemos and Mesquita [
20] presented a numerical simulation of the combustion of an air/methane mixture in porous materials using a model that accounts for intra-pore levels of turbulent kinetic energy. Statistical tests and corresponding mathematical modeling were used in [
21] to determine the effect of temperature, moisture content, density, and porosity of the material on the effective moisture diffusion coefficient during the convective drying of celery root. A statistical approach for modeling small molecules at non-zero concentrations in microporous materials was developed in [
22].
In recent years, statistical modeling has also been developed further through approaches based on various nonlinear regression models [
23]. The study [
24] proposes a new method for calculating nonlinear heat conduction processes under Robin boundary conditions. The article [
25] develops and implements an effective numerical scheme for studying solutions to the nonlinear diffusion mass transfer problem.
In view of the need to solve the problems mentioned above and a number of other similar problems, a certain mathematical apparatus for statistical modeling of reaction–diffusion problems has been developed. A systematic presentation of methods and approaches to statistical modeling of random diffusion fluxes of impurity particles in a two-phase stratified strip with a stochastic arrangement of phases and random thickness of inclusion layers is given in [
26]. Similar diffusion models are important in the design of composite layered materials [
27], the study of filter throughput properties [
28], the prediction of the spread of pollutants in the environment, etc. The calculation formulas for the diffusion flux averaged over an ensemble of phase configurations and over the thickness of the inclusion were obtained.
Analysis of the problem has shown that boundary conditions cannot always be imposed correctly from physical considerations alone, even in a fairly general form. The relevant research is complex and incomplete, so the necessary analysis and generalizations are lacking. For this reason, a mathematical model of the transfer process in a layer with experimental data given on part of the layer boundary was developed and studied in [
29]. From the experimental data, a boundary condition is constructed, the corresponding mixed problem is formulated and solved, the influence of the statistical characteristics of the sample of experimental data on the solution is analyzed, and a two-sided statistical estimate of the solution is determined. The reliable intervals for the coefficients of the regression equation and the corresponding reliable intervals for the required function are established in the case of different samples in terms of size and variance.
This paper considers a parabolic boundary value problem that describes heat, mass, charge, and similar transfer processes in a layer when experimental data on the required function are available at one of the boundaries and the initial condition is non-zero. The main focus of this article is to expand and generalize the range of statistical modeling approaches to the linear transport models presented in [
26,
27,
28,
29]. In particular, developing the ideas and approaches of the paper [
29], we present an effective method for establishing reliable intervals for solutions of parabolic differential equations with experimental data at the boundary of the body and analyze the influence of the reliability level on the width of the reliable intervals. Developing and generalizing the methodology of the above-mentioned papers, we construct a two-sided critical domain for the solution. The aim of the article is to analyze numerically how the solution of the boundary value problem depends on the statistical characteristics of the sample. The article also investigates the influence of the reliability level on the reliable intervals for the linear regression and for the solution of the parabolic boundary value problem.
2. Formulation of a Boundary Value Problem for a Parabolic Equation in a Layer with Experimental Data at the Boundary
In the layer with the thickness
, the process described by the function
, takes place, being a solution of the second-order partial differential equation [
30]
where
is a constant coefficient,
is time, and
is a spatial coordinate.
We assume that at the initial moment the function
and at
the upper surface of the layer is subjected to a stable source
At the lower boundary of the layer, the values of the function
were obtained experimentally at
time points (
Table 1).
Then a linear regression model is built [
31,
32], searching for its coefficients using the least squares method [
33]
The coefficients
and
are calculated using the following formulas
where
. As a result, the boundary condition at
will take the form
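The least-squares step described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `fit_line`, the names `intercept` and `slope`, and the synthetic data are hypothetical and are not taken from Table 1 or from the paper's notation.

```python
def fit_line(times, values):
    """Ordinary least-squares fit of values ~ intercept + slope * time."""
    n = len(times)
    if n < 2 or n != len(values):
        raise ValueError("need at least two (time, value) pairs")
    sum_t = sum(times)
    sum_y = sum(values)
    sum_tt = sum(t * t for t in times)
    sum_ty = sum(t * y for t, y in zip(times, values))
    denom = n * sum_tt - sum_t ** 2  # zero only if all times coincide
    if denom == 0:
        raise ValueError("all time points are identical")
    slope = (n * sum_ty - sum_t * sum_y) / denom
    intercept = (sum_y - slope * sum_t) / n
    return intercept, slope


# Synthetic, noise-free data (an assumption for this sketch): y = 2 + 3 t
times = [0.0, 0.5, 1.0, 1.5, 2.0]
values = [2.0 + 3.0 * t for t in times]
intercept, slope = fit_line(times, values)
print(intercept, slope)
```

With noise-free data the fit should recover the generating line; with measured data the same two numbers define the time-dependent boundary value used in the rest of the derivation.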
3. Solving a Parabolic Boundary Value Problem Under the Condition of Linear Regression on the Layer Boundary
To solve the problem (1)–(3), (7) by way of substitution
one can reduce it to a problem with zero boundary conditions with respect to the function
The required function
satisfies the initial condition
and zero boundary conditions
Applying to the boundary value problem (9)–(11) the finite integral Fourier sine transform [
34] (
,
), the following problem is obtained in the transform domain
The solution to problems (12) and (13) is the following expression [
35]
After the inverse Fourier sine transform [
36] is applied to the relationship (14), one can obtain
Taking into account (8), the solution to the problem (1)–(3), (7) is obtained in the form
Formula (15) can be specified taking into account that it is a linear regression, i.e., Formula (4) is valid. Then
and, accordingly, the solution becomes
Since the sample of the experimental data for the required function is given in the interval , the resulting formula (16) is valid for this time period.
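The solution procedure of this section (substitution to homogeneous boundary conditions, finite sine transform, solution of the resulting ordinary differential equations, inversion) can be illustrated numerically. The sketch below is not the paper's Formula (16); it assumes the equation u_t = D u_xx on 0 < x < L, a constant value c0 at x = 0, the linear regression a + b t at x = L, and a zero initial condition. All names (`layer_solution`, `c0`, `a`, `b`, `n_terms`) are hypothetical.

```python
import math


def layer_solution(x, t, D, L, c0, a, b, n_terms=2000):
    """u(x, t) for u_t = D u_xx on 0 < x < L with u(0, t) = c0,
    u(L, t) = a + b * t and u(x, 0) = 0 (all assumed for this sketch)."""
    # Linear-in-x part that carries the inhomogeneous boundary values.
    boundary_part = c0 * (1.0 - x / L) + (a + b * t) * x / L
    series = 0.0
    for n in range(1, n_terms + 1):
        y_n = n * math.pi / L          # eigenvalue of the sine transform
        lam = D * y_n ** 2             # decay rate of the n-th mode
        sign = (-1.0) ** (n + 1)
        decay = math.exp(-lam * t)
        # Transform of the initial value of the auxiliary function.
        v0 = -(c0 + sign * a) / y_n
        # Mode amplitude: decaying initial part plus the response to the
        # source term -b * x / L created by the time-dependent boundary.
        v_n = v0 * decay - b * sign * (1.0 - decay) / (y_n * lam)
        series += v_n * math.sin(y_n * x)
    return boundary_part + 2.0 / L * series


# Example call with arbitrary illustrative parameters.
print(layer_solution(0.5, 0.1, D=1.0, L=1.0, c0=1.0, a=0.5, b=0.3))
```

The series converges slowly near t = 0 because the boundary data are incompatible with the zero initial state, which is why a fairly large default `n_terms` is used; for later times the exponential factors make a few terms sufficient.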
4. Two-Sided Statistical Estimation for the Solution of a Boundary Value Problem
To determine the two-sided estimate of the solution to the problem (1)–(3), (7), represented using linear regression on the layer boundary
, let us denote
In view of Formula (16), one can obtain
It should be noted that for exponent
and
the following estimation is true
Hence, for the positive function
(for example, if
;
), the next estimation can be obtained
For the negative function
(in particular, in the case of
;
), the following inequality is true
Summing the series included in the last two inequalities [
37], we obtain for
and in case of
Then, from inequalities (19)–(21), taking into account the relation (17), one can obtain a two-sided estimate for the required function :
Substituting the expressions of the linear regression coefficients in the form of (5) and (6) into inequalities (22) and (23), we obtain
Let us study the asymptotic case at
. Using the asymptotic relationship [
29]
the following asymptotic inequalities are written
Note that the asymptotic inequalities (26) and (27) imply that the change in sample size affects the bounds of the two-sided estimates (24) and (25) for the function . In the case of experimental values of the same sign as the sample size increases, both boundaries of the functional intervals for (24) and (25) increase for , , i.e., the bilateral estimate of the solution shifts towards higher values . If one expresses the linear regression coefficients in terms of covariance and variance, then the two-sided estimation of the function will be in the form of
is the covariance of values
and
[
38]; and
is the variance of the variable
. Note that in the case of
, an increase in the covariance of values
and
leads to a shift in the bilateral evaluation for
upwards, and the interval itself increases. At the same time, the increase in the dispersion of values
shifts the value downwards
, and the width of the bilateral evaluation narrows. For
the increase of
leads to a downward shift in the bilateral estimate, but the interval itself increases. The increase of
shifts upwards the values
, and the width of the bilateral evaluation narrows.
Note that the linear regression coefficients can also be presented using the correlation coefficient
[
29]. The two-sided estimation of the solution to the initial boundary value problem will then take the form:
where , and are standard deviations of values and , respectively. The obtained bilateral estimate implies that an increase in the correlation coefficient leads to an increase in the bilateral estimation interval for and for to shift the interval upwards, and for to shift the interval downwards.
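The three ways of writing the regression slope used in this section (through sums, through covariance and variance, and through the correlation coefficient and standard deviations) are algebraically equivalent. A minimal check, assuming population-style (divide by n) moments and hypothetical names and data:

```python
import math


def slope_three_ways(times, values):
    """Return the least-squares slope computed from sums, from cov/var,
    and from the correlation coefficient times the ratio of deviations."""
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(values) / n
    cov = sum((t - mean_t) * (y - mean_y)
              for t, y in zip(times, values)) / n
    var_t = sum((t - mean_t) ** 2 for t in times) / n
    var_y = sum((y - mean_y) ** 2 for y in values) / n
    sigma_t, sigma_y = math.sqrt(var_t), math.sqrt(var_y)
    r = cov / (sigma_t * sigma_y)             # correlation coefficient

    s_t, s_y = sum(times), sum(values)
    s_tt = sum(t * t for t in times)
    s_ty = sum(t * y for t, y in zip(times, values))
    from_sums = (n * s_ty - s_t * s_y) / (n * s_tt - s_t ** 2)
    return from_sums, cov / var_t, r * sigma_y / sigma_t


# Illustrative data only (not one of the paper's samples).
forms = slope_three_ways([0.0, 0.5, 1.0, 1.5, 2.0],
                         [0.3, 0.9, 1.1, 1.8, 2.2])
print(forms)
```

Because the forms coincide, statements about how the estimate reacts to covariance, variance, or correlation are statements about the same slope viewed through different sample characteristics.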
5. Reliable Intervals for Solving a Boundary Value Problem
To find the limits of the reliable intervals for the function
let us represent it in the form
where in the case of linear regression [
39]
The lower and upper bounds of the reliable regression zone are obtained by connecting the points with coordinates
and
, respectively. We would like to note that
is the solution of the equation
where
is the number of degrees of freedom,
;
is the Gamma function [
36]. And “+” in Formula (28) refers to the upper limit of the critical domain, and “−” refers to the lower limit of the critical domain.
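The critical value of Student's distribution is defined implicitly through an integral of a density that contains the Gamma function. One way to solve such an equation numerically is sketched below. It assumes the standard two-sided definition, in which the probability of the interval from -t to t equals the reliability level; the function names, the Simpson rule, and the bisection tolerance are choices made for this illustration and are not taken from the paper.

```python
import math


def student_pdf(x, k):
    """Density of Student's t distribution with k degrees of freedom."""
    log_norm = (math.lgamma((k + 1) / 2) - math.lgamma(k / 2)
                - 0.5 * math.log(k * math.pi))
    return math.exp(log_norm - (k + 1) / 2 * math.log1p(x * x / k))


def central_probability(t, k, steps=2000):
    """P(-t < T < t) by composite Simpson integration of the density."""
    h = t / steps
    total = student_pdf(0.0, k) + student_pdf(t, k)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * student_pdf(i * h, k)
    return 2.0 * total * h / 3.0   # factor 2: the density is even


def t_critical(gamma, k, tol=1e-10):
    """Solve central_probability(t, k) = gamma for t by bisection."""
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must lie strictly between 0 and 1")
    lo, hi = 0.0, 1.0
    while central_probability(hi, k) < gamma:   # expand the bracket
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if central_probability(mid, k) < gamma:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


for level in (0.90, 0.95, 0.99):
    print(level, t_critical(level, 10))
```

Running the loop for a fixed number of degrees of freedom shows the critical value growing with the reliability level, and repeating it for larger `k` shows the value shrinking as the sample grows.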
Relationship (30), since
is a natural number, can be represented as
where
,
[
33]. From Formula (31), it follows that as the sample size
increases, the values of the subintegral function decrease, so in order for the area under the subintegral function to remain constant (namely
), the integration interval must increase, and, accordingly, the value of
must decrease. However, the value
increases as the level of reliability
increases. Also, the width of the reliable interval for the function
increases with increasing
, since the reliability level is
and with increasing
deviation
increases. By substituting expression (28) into relation (15), the following is obtained:
Similarly to (31), the solution of the boundary value problem (1)–(3), (7) is represented in the form
Taking into consideration the structure of expression (32), we have
Substituting the expression
by Formula (29) into Formula (33), one can obtain
After integration, the half-width of the reliable interval for solving the original boundary value problem (1)–(3), (7) is obtained
Note that the function is directly proportional to the value of Student’s t-test and standard deviation . Notably, at , and at ; therefore , if , then .
6. The Two-Sided Critical Domain for the Predicted Value of the Built Linear Regression Model
To set a two-sided critical domain for function
as the solution to the boundary value problem (1)–(3), (7), first, using the method given in [
32], we find the two-sided critical domain for the linear regression
(4). According to the given level of significance
, the reliability level
is determined. According to the table of experimental data (
Table 1), the matrix of regressors can be written as
For the predicted value of the linear regression
, variance is determined by the formula
where
. The expected bilateral critical domain for
is obtained by substituting the estimate of the standard deviation
(34) into
. Then the true value of the required function on the lower boundary of the layer
with probability
varies within the range of [
32]:
The determinant of the next information matrix
is obtained in the form
The lengths of the bilateral critical domains will be different at different points of the factor space, since the vectors are different in these points, i.e., the accuracy of the predicted value of the linear regression may vary at different time moments .
Let us find the boundaries of the two-sided critical domain for the function .
Suppose
where in the case of linear regression, which follows from Formula (35),
So
“+” refers to the upper limit of the critical domain in Formula (36), and “−” refers to the lower limit of the critical domain.
Having studied the influence of the
on the coefficients
,
(38), note that as the number of measurements
increases, these coefficients decrease, moreover
Then increasing the number of experiments in the same interval at a constant value leads to a narrowing of the bilateral critical domain (35), where with probability experimental data are likely to fall. At the same time an increase in the standard deviation of , with the same number of measurements leads to an increase in the two-sided critical domain for the linear regression function . The influence of the variance of the variable on the bilateral critical domain is similar to the influence of ; however, growth by/into the same number slows down the growth rate of this interval. Since the coefficient is directly proportional to the value of Student’s t-test , then with an increase in the reliability level , the value of the coefficient increases.
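The dependence of the band width on the position in the factor space can be made concrete with a short computation. The sketch assumes the usual form for the variance of the fitted mean response, s² xᵀ(XᵀX)⁻¹x with x = (1, t); whether the paper's Formula (35) uses exactly this form (rather than, for example, a prediction interval for a new observation) cannot be confirmed from the text. The function name, the synthetic data, and the tabulated critical value 2.365 (seven degrees of freedom, reliability 0.95) are illustrative assumptions.

```python
import math


def regression_band(times, values, t_crit, t_eval):
    """Half-width of the band around the fitted line at each time in t_eval.

    t_crit is the Student critical value for n - 2 degrees of freedom and
    the chosen reliability level; it is passed in rather than computed.
    """
    n = len(times)
    s_t, s_tt = sum(times), sum(t * t for t in times)
    s_y = sum(values)
    s_ty = sum(t * y for t, y in zip(times, values))
    det = n * s_tt - s_t ** 2                 # determinant of X^T X
    slope = (n * s_ty - s_t * s_y) / det
    intercept = (s_y - slope * s_t) / n
    rss = sum((y - intercept - slope * t) ** 2
              for t, y in zip(times, values))
    s2 = rss / (n - 2)                        # residual variance estimate
    half_widths = []
    for t in t_eval:
        # x^T (X^T X)^{-1} x with x = (1, t), written out for the 2x2 case
        leverage = (s_tt - 2.0 * t * s_t + n * t * t) / det
        half_widths.append(t_crit * math.sqrt(s2 * leverage))
    return intercept, slope, half_widths


# Synthetic sample: nine equally spaced times with a small alternating error.
times = [0.25 * i for i in range(9)]
values = [0.2 + 0.8 * t + 0.05 * (-1) ** i for i, t in enumerate(times)]
_, _, hw = regression_band(times, values, 2.365, [0.0, 0.5, 1.0, 1.5, 2.0])
print(hw)
```

Under these assumptions the band is narrowest at the mean of the measurement times and widens symmetrically on either side, and it scales linearly with both the critical value and the residual standard deviation.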
7. Finding a Two-Sided Critical Domain for the Solution of a Boundary Value Problem
Substituting the representation of the function
(36) into relationship (15), the boundaries of the two-sided critical domain for solving the boundary value problem (1)–(3), (7) are obtained
In the case of imposing a linear regression as a condition on the lower boundary of the layer, taking into account Formula (16), Formula (39) can be represented as
Here
. The indefinite integrals
and
cannot be expressed in elementary functions, so we find approximate expressions for them. Since the function
is continuously differentiable the required number of times, we will expand it into a Taylor series over the interval of the given experimental measurements
in the neighborhood of point
from this interval:
Assume
. For the Taylor series to converge, the following condition must be ensured
or
[
37]. Limiting ourselves to the first three terms of the Taylor series, let us analyze the function
, calculated by Formula (41) depending on the value of
.
The total residual term of the series in this case is calculated by the formula
Table 2 shows the exact values of the function
and the values of its expansion in the Taylor series (42) with a limit of three terms for three levels of reliability, calculated for
0.001, 0.003 in the interval
.
Figure 1 demonstrates their corresponding total residual terms
, calculated by (37), where
is the middle of the interval
. Curves 1–3 correspond to the values
, 0.95, 0.99 for
(
Figure 1a) and
(
Figure 1b). Varying the parameters within a wide range gives a similar result (Figure 1); therefore, we assume that three terms of the series (41) sufficiently approximate the function
(37).
In case
, where
, and
, the following is true
Then, for the bilateral critical domain, the following bounds are obtained
Consider that . Then, for long-term processes described by the original boundary value problem, the greater the , the greater the width of the bilateral critical domain, i.e., functions have no stationary mode (time asymptotic). The width of the bilateral critical domain is most influenced by the coefficient . In particular, the smaller the value of , the narrower the two-sided critical domain for the same level of reliability . The larger the variance of the predicted value of the linear regression , the wider the two-sided critical domain.
8. Influence of Statistical Characteristics of a Sample on Reliable Intervals and Bilateral Critical Domain of a Solution to a Boundary Value Problem
Let us analyze the influence of the statistical characteristics of experimental data on the required function at the lower boundary of the layer using concrete examples. Samples of experimental data were obtained for both the uniform and non-uniform distribution of the study time interval [
39]. The basic parameters are
,
,
, and
. The series is calculated with an accuracy of
. Let us consider the cases of samples of large and small volumes characterized by large or small variance.
Figures 2, 4, 6, 8, 10, and 12 demonstrate linear regression for the relevant sample (Figures 2a, 4a, 6a, 8a, 10a, and 12a) and solutions to the boundary value problem (1)–(3), (7) (Figures 2b, 4b, 6b, 8b, 10b, and 12b). In Figures 2a, 4a, 6a, 8a, 10a, and 12a, solid lines (curves 1) indicate the function , the dashed and dotted lines indicate its reliable intervals (curves 2−, 3−, 4− and 2+, 3+, 4+), and the green dots indicate the experimental data presented in the corresponding table. Figures 2b, 4b, 6b, and 10b show solutions to the initial boundary value problem (solid lines) for long time intervals at time moments 0.1, 0.5, 2 (curves 1–3). Figures 8b and 12b show the same solutions for short time intervals at time moments 0.1, 0.5, 1 (curves 1–3). The curves (dashed lines) with the index “+” are calculated for the upper limits of the reliable intervals , and with the index “−” for the lower ones . The lower indices 1, 2, 3 correspond to the values of the reliability level , 0.95, 0.99. Figures 3, 5, 7, 9, 11 and 13 show solutions to the initial boundary value problem , that are normalized to the value of the function on the upper boundary of the layer , and the corresponding bilateral critical domains calculated by the Fisher criterion, for 0.1 (Figures 3a, 5a, 7a, 9a, 11a and 13a) and 0.5 (Figures 3b, 5b, 7b, 9b, 11b and 13b). Curves 2−, 2+ are calculated for , curves 3−, 3+ are calculated for , and curves 4−, 4+ are calculated for .
Tables 4, 6, 8, 10, 12 and 14 contain the maximum widths of the two-sided critical domain of the boundary value problem (1)–(3), (7) for different time moments and , 0.95, 0.99 for six samples.
8.1. Sample I: Large Sample, Long Time Interval, Large Variance
Sample of experimental data
with volume
is shown in
Table 3.
The sample range is
= [0, 2.9716], and the sample variance is
= 0.893931523259907. Let us build a regression based on the data in
Table 3 using the least squares method. Based on the view of the correlation field, we assume that the time dependence of the required function on the lower boundary of the body
is linear. According to the sample data, the coefficients of the linear regression (4) are
= 0.835387001287001 and
= 0.16861996996997. Now, let us find reliable intervals with reliability
for the linear regression coefficients. For the coefficients
and
reliable intervals with reliability
are calculated within reliable limits
(0.792324843411; 0.878449159163);
(0.080983805814; 0.256256134126); with reliability
are calculated within reliable limits
(0.783632562191029; 0.887141440382974);
(0.063294070291443; 0.273945869648497); with reliability
are calculated within reliable limits
(0.765903964133865; 0.904870038440137); and
(0.0272144494313441; 0.310025490508596). For short times of the transfer process described by the boundary value problem (1)–(3), (7), the function
monotonically decreases (curve 1,
Figure 2b). As time increases, the values of the function also increase, and a maximum forms at the point
(curve 2,
Figure 2b). This maximum eventually shifts to the lower boundary of the body, for example
. Afterwards, the behavior of the function
but its values in the entire body area increase significantly (curve 3,
Figure 2b). Boundaries of the bilateral critical domain
, calculated by Formula (43), for this sample and all others are given in
Appendix A.
As the transfer time increases, not only do the values
increase, but the widths of the reliable intervals also increase
(
Figure 2b). Thus, with an increase in
value from 0.5 to 2, the maximum widths of the reliable intervals more than double, regardless of the value of
. For example,
. At the same time, increasing the level of reliability
from 0.9 to 0.99 widens the reliable interval by up to 61%. In particular,
.
In contrast to the reliable intervals, the bilateral critical domain of the problem solution initially decreases (
Figure 3), reaching its narrowest values in the middle of the time interval
; further, the bilateral critical domain expands, symmetrically to the narrowing in the initial time interval (
Table 4). Thus, for short times, the width of this domain
decreases by about 17%, namely
0.833 for
(
Figure 3a,b); for medium times, the difference between
and
decreases by 16.6%. The largest widths of the bilateral critical domain at increasing
increase by 60%. For example,
.
8.2. Sample II: Large Sample, Long Time Interval, Small Variance
Sample of experimental data
with volume
is shown in
Table 5.
The sample range is [0, 0.694], and the sample variance is 0.174264028380218.
For the regression coefficients = 0.155924066924067 and = 0.218493993993994, reliable intervals with reliability are calculated within the following reliable limits (0.139918763614950; 0.171929370233184); and (0.185921464949008; 0.251066523038980); with reliability are calculated within the following reliable limits (0.136688; 0.17516); and (0.179346560289016; 0.257641427698971); with reliability are calculated within the following reliable limits (0.130098675459337; 0.181749458388797); and (0.165936519928265; 0.271051468059722).
As the transfer process increases in time, the function value
increases in the entire body area (
Figure 4b). For short and mid-long times
is a monotonically decreasing function. The local maximum of the function begins to form at moment
0.6 in the point
. With the increase in
this maximum grows, shifting to the lower boundary of the layer (
) and becomes global (
Figure 4b). With the increase in time, the width of the reliable interval
grows (
Figure 4b), similarly for different values
. For example,
.
The two-sided critical domain of the sample is also initially narrowed (
Figure 5), and then, in the second half of the time interval
is extended (
Table 6). The width of this domain
for short times decreases by about 17%, namely
0.833 for
(
Figure 5); for long times, the difference between
and
decreases by 16.6%. There is also symmetry in the widths
in time intervals
and
. The largest widths of the bilateral critical domain at increasing
increase by 60% too. For example,
. For samples
I and
II, the growth rate of the reliable interval
and reduction in the bilateral critical domain
, and then the corresponding growth is the same.
8.3. Sample III: Small Sample, Long Time Interval, Large Variance
Sample of experimental data
with volume
is shown in
Table 7.
The sample range is
[0, 4.4371], and sample variance is
= 1.47815809975753. For regression coefficients
1.361503496503 and
= −0.158314102564102 reliable intervals with reliability
are calculated within the following reliable limits
(1.2942027604045400; 1.4288042326024500); and
(−0.289421049582769; −0.0272071555454357); with reliability
are calculated within the following reliable limits
(1.27876771583259; 1.4442392771744); and
(−0.319489689069116; 0.00286148394091157); with reliability
are calculated within the following reliable limits
(1.24382131097189; 1.47918568203511); and
(−0.387567940751664; 0.0709397356234593). For the experimental data on the values of the required function at the lower boundary of the layer, presented in
Table 7, also for short transfer times, the function
monotonically decreases (Curve 1,
Figure 6b). As time increases, the values of the function increase and a maximum forms at the point
(Curve 2,
Figure 6b), that grows over time and shifts to the lower boundary of the layer. In particular, at
the maximum value of the function is reached at the point
. For long times, the values of the function
increase significantly in the entire body domain (Curve 3,
Figure 6b). With the prolongation of time, the width of the reliable interval
increases (
Figure 6b), in the same way for different values
.
The bilateral critical domain of the problem solution also reduces over time (
Figure 7) and then expands (
Table 8) for this sample. The width of this domain
for short times decreases by about 17%, namely
for
(
Figure 7); for long times, the difference between
and
decreases by 10.6%. The larger widths of the bilateral critical domain with increasing
increase by 75%. For example,
. Symmetry in relation to the widths
of time intervals
and
is also present.
8.4. Sample IV: Small Sample, Short Time Interval, Large Variance
Let us now consider a sample of experimental data
with volume
over a short time interval, as
Table 9 shows.
Sample range [0, 5.9291], and sample variance = 2.18380650788105. For the regression coefficients 5.9880034965035 and = −0.2559269230769220 reliable intervals with reliability are calculated within the following reliable limits (5.4663064488105100; 6.5097005441964900); and (−0.5946954178281560; 0.0828415716743118); with reliability are calculated within the following reliable limits (5.34665817078342; 6.62934882222358); and (−0.672390063074964; 0.16053621692112); with reliability are calculated within the following reliable limits (5.07576312118831; 6.90024387181869); and (−0.848298108927857; 0.336444262774013).
For the experimental data on the values of the required function at the lower boundary of the layer, presented in
Table 9, the formation of a local maximum of the function
already begins for short times. This
forms in the middle of the body, grows over time, and shifts to the lower border of the layer. For example,
and
0.635718339 (Curve 1,
Figure 8b),
and
4.4428612 (Curve 2,
Figure 8b),
and
9.007544363 (Curve 3,
Figure 8b). The reliable interval width
with the prolongation of the process time increases (
Figure 8b), and is the same for the different values
. For example,
1.93. For this sample, the bilateral critical domain of the solution also narrows with time (
Figure 9), and then enlarges symmetrically on the intervals
and
(
Table 10). The width of this domain
for short times decreases by 25%, namely
= 0.75 for
(
Figure 9); for medium times, the difference between
= 0.5 and
increases by 61.5%. The largest widths of the bilateral critical domain at increasing
also increase by 75%. For example,
8.5. Sample V: Small Sample, Long Time Interval, Small Variance
Sample of experimental data with volume is obtained over a long time interval.
Sample range
[0, 0.794], and sample variance
= 0.223456914022016. For the regression coefficients
= 0.180180811808118 and
= 0.26010024600246 reliable intervals with reliability
are calculated within the following reliable limits
(0.128690059337185; 0.231671564279051); and
(0.154680344031757; 0.365520147973162); with reliability
are calculated within the following reliable limits
(0.116880945418082; 0.243480678198154); and
(0.130502883575815; 0.389697608429105); with reliability
are calculated within the following reliable limits
(0.0901439914598696; 0.270217632156367); and
(0.0757628199719453; 0.444437672032974). For the experimental data on the values of the required function at the lower boundary of the layer, presented in
Table 11, for short times, the function
monotonically decreases (Curve 1,
Figure 10b). The local maximum of the function
is formed at the point
(Curve 2,
Figure 10b) and, as in the previous cases, increases with time and shifts to the lower boundary of the layer (Curve 3,
Figure 10b). For example,
and
(Curve 3,
Figure 10b).
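The Sample V limits quoted above can be checked for two properties that the discussion relies on: each reliable interval is symmetric about its regression coefficient, and the intervals widen in the order in which the three reliability levels are quoted. The following is a minimal Python sketch of such a check. The numbers are copied from the text; the labels "first" and "second" and the helper names `midpoint`, `width`, `is_symmetric`, and `widths_increase` are illustrative assumptions and do not come from the paper.

```python
# Sample V regression coefficients and their reliable limits, copied from the text.
# "first"/"second" are illustrative labels; the intervals are listed in the order
# in which the three reliability levels are quoted.
coefficients = {
    "first": (0.180180811808118, [
        (0.128690059337185, 0.231671564279051),
        (0.116880945418082, 0.243480678198154),
        (0.0901439914598696, 0.270217632156367),
    ]),
    "second": (0.26010024600246, [
        (0.154680344031757, 0.365520147973162),
        (0.130502883575815, 0.389697608429105),
        (0.0757628199719453, 0.444437672032974),
    ]),
}


def midpoint(interval):
    """Centre of an interval given as (lower, upper)."""
    lower, upper = interval
    return (lower + upper) / 2.0


def width(interval):
    """Width of an interval given as (lower, upper)."""
    lower, upper = interval
    return upper - lower


def is_symmetric(estimate, interval, tol=1e-9):
    """True if the interval is centred on the estimate within tol."""
    return abs(midpoint(interval) - estimate) < tol


def widths_increase(intervals):
    """True if every interval is strictly wider than the previous one."""
    widths = [width(i) for i in intervals]
    return all(a < b for a, b in zip(widths, widths[1:]))


for name, (estimate, intervals) in coefficients.items():
    symmetric = all(is_symmetric(estimate, i) for i in intervals)
    print(name, "symmetric:", symmetric, "widening:", widths_increase(intervals))
```

The same check can be applied to any other sample by replacing the entries of `coefficients` with the values quoted for that sample.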
The reliable interval width
increases as the process time grows (
Figure 10b); moreover, this occurs in the same manner for the different values
. For example,
For this sample, the bilateral critical domain of the solution of the problem also narrows with time (
Figure 11), and then widens symmetrically in the intervals
and
(
Table 12). The width of this domain
for short times decreases by 17%, namely,
for
(
Figure 11); for medium times, the difference between
and
decreases by 16.07%.
The largest widths of the bilateral critical domain with increasing increase by 75%. For example,
8.6. Sample VI: Small Sample, Short Time Interval, Small Variance
A sample of experimental data
with volume
is obtained over a short time interval and is presented in
Table 13.
The sample range
= [0, 0.02], and sample variance
= 0.000044820833333. For the regression coefficients
= 0.0181030150250685 and
= 0.00291602180076567 reliable intervals with reliability
are calculated within the following reliable limits
(0.014367706354996; 0.02183832369514140); and
(0.000667626599415; 0.00516441700211638); with reliability
are calculated within the following reliable limits
(0.0135110343606834; 0.0226949956894535); and
(0.0001519698301164; 0.0056800737714150); with reliability
are calculated within the following reliable limits
(0.01157144771378340; 0.02463458233635350); and
(−0.00101552600347659; 0.00684756960500794). For the experimental data on the values of the required function at the lower boundary of the layer, presented in
Table 13, for the entire time interval
the function
monotonically decreases (
Figure 12b). Moreover, the time
is already stationary
. As time approaches
the growth rate of the function
significantly slows down (
Figure 12b). The reliable interval width
for this sample is substantially smaller than for the previously discussed ones. As the process time grows, the reliable interval width increases slightly (
Figure 12b). For example,
1.43. For this sample, the bilateral critical domain of the solution is much narrower than for the other samples. As before, the bilateral critical domain for Sample VI narrows with time (
Figure 13) and then widens symmetrically on the intervals
and
(
Table 14). The width of this domain
for short times decreases by 25%, namely
0.746 for
(
Figure 13); for long times, the difference between
and
increases by 34%. The largest widths of the bilateral critical domain at increasing
increase by 73%. For example,
. For all the considered samples and at all time moments
the reliable interval and bilateral critical domain for the function
are symmetrical, i.e.,
and
for
,
.
For all types of samples, the following can be observed: the lower the reliability level
, the narrower the reliable intervals for both the linear regression
and the solution of the boundary value problem (1)–(3), (7)
(
Figure 2,
Figure 4,
Figure 6,
Figure 8,
Figure 10 and
Figure 12). The same is true for
: the higher the level of reliability, the wider the bilateral critical domain (
Figure 3,
Figure 5,
Figure 7,
Figure 9,
Figure 11 and
Figure 13). The point
with the largest width of the reliable interval for the function
is the same for all six samples regardless of the level of reliability
, but may differ at different moments in time. Thus,
0.8 for
,
0.725 for
, and
(
Figure 2b,
Figure 4b,
Figure 6b and
Figure 10b). At the same time, the largest width of the bilateral critical domain is observed at the lower boundary of the layer, i.e.,
1, and is constant over the entire time period
of the process under study (
Figure 3,
Figure 5,
Figure 7,
Figure 9,
Figure 11 and
Figure 13). The point
, where the widths of the reliable interval and the bilateral critical domain are equal to zero
, is located at the upper boundary of the layer, since the value of the required function at this boundary is known and constant in time.
9. Conclusions
The presented study proposes a methodology for studying physical, chemical, or biological processes that can be described by parabolic boundary value problems with incomplete data at the body boundary. A specific boundary value problem is formulated for a layer with a non-zero initial condition, the action of a steady source on one boundary of the body, and a sample of experimental data on the required function on the other. Based on the experimental data, a linear regression model is built using the least squares method and is used as a boundary condition. The solution to the boundary value problem is found by applying the finite integral Fourier transform. It is obtained for a general regression model containing integral terms of the regression function and is specified for the case of linear regression based on experimental data at the lower boundary of the layer.
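The regression step summarized above can be illustrated with a minimal Python sketch of an ordinary least squares fit of a straight line to time-stamped boundary measurements. The function name `fit_line`, the variable names, and the measurement values are hypothetical and serve only as an illustration; they are not the data or the notation of the study.

```python
def fit_line(times, values):
    """Ordinary least squares fit of values ~ intercept + slope * times.

    Returns (intercept, slope). Illustrative helper, not the paper's code.
    """
    n = len(times)
    if n != len(values) or n < 2:
        raise ValueError("need at least two (time, value) pairs of equal length")
    t_mean = sum(times) / n
    y_mean = sum(values) / n
    # Centred sums of squares and cross-products.
    s_tt = sum((t - t_mean) ** 2 for t in times)
    if s_tt == 0.0:
        raise ValueError("all time points coincide; the slope is undefined")
    s_ty = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, values))
    slope = s_ty / s_tt
    intercept = y_mean - slope * t_mean
    return intercept, slope


# Hypothetical measurements of the required function at the lower boundary.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
values = [0.10, 0.32, 0.49, 0.71, 0.90]

intercept, slope = fit_line(times, values)


def boundary_value(t):
    """Fitted regression line, usable as a time-dependent boundary condition."""
    return intercept + slope * t


print(intercept, slope, boundary_value(2.5))
```

In this sketch the fitted line plays the role of the boundary condition at the lower boundary of the layer; the intercept corresponds to the free term of the regression and `slope` to its slope.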
A two-sided statistical estimate of the solution to the boundary value problem is determined through the coefficients of linear regression, and its dependence on sample size and covariance is analyzed. The corresponding reliable intervals for the linear regression and the required function are determined, for a given level of reliability, on the basis of the obtained solution to the problem.
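For orientation, the textbook two-sided intervals for the coefficients of a simple linear regression are recalled below. This is a sketch in assumed notation (measurements $(t_i, y_i)$, $i = 1, \dots, n$, free term $\hat{a}$, slope $\hat{b}$, reliability level $\gamma$), which need not coincide with the symbols or the exact expressions used in the derivation earlier in the paper; where they differ, the paper's own formulas take precedence.

```latex
% Assumed notation: \bar{t}, \bar{y} are sample means, S_{tt} = \sum_i (t_i - \bar{t})^2,
% and t_{p,\,n-2} is the p-quantile of Student's t distribution with n-2 degrees of freedom.
\hat{b} = \frac{\sum_{i=1}^{n} (t_i - \bar{t})(y_i - \bar{y})}{S_{tt}}, \qquad
\hat{a} = \bar{y} - \hat{b}\,\bar{t}, \qquad
s^2 = \frac{1}{n-2} \sum_{i=1}^{n} \bigl( y_i - \hat{a} - \hat{b}\,t_i \bigr)^2,

\hat{b} \pm t_{(1+\gamma)/2,\,n-2}\, \frac{s}{\sqrt{S_{tt}}}, \qquad
\hat{a} \pm t_{(1+\gamma)/2,\,n-2}\; s \sqrt{\frac{1}{n} + \frac{\bar{t}^{\,2}}{S_{tt}}}.
```

Both intervals are symmetric about the point estimates and widen as $\gamma$ grows, because the Student quantile increases with $\gamma$; this agrees with the behavior reported for the samples considered.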
It is shown that the higher the absolute values of the required function, the greater the width of the reliable interval. For short times, the influence of the reliable interval for the slope of the regression equation is imperceptible, while the influence of the reliable interval for the free term of the regression is many times greater, especially in the vicinity of the lower boundary of the layer. It is also noted that the smaller the variance of the sample of experimental data, the smaller the width of the reliable interval for the solution of the formulated boundary value problem.
The formula for determining the bilateral critical domain based on the Fisher criterion is obtained and analyzed. The influence of the statistical characteristics of the sample of experimental data on the required function at the lower boundary of the layer is studied using specific examples. Samples of large and small volume, with large or small variance, obtained over long or short time intervals are considered.
A numerical analysis of the solution to the boundary value problem depending on the statistical characteristics of the sample is carried out. It is established that a larger variance of the time variable leads to a decrease in the solution values throughout the entire domain of the body, whereas a higher variance of the responses results in an increase in the values of the required function. An increase in the correlation coefficient also leads to an increase in the values of the solution and the formation of its local or global maximum in the lower half of the layer. The influence of the level of reliability on the reliable intervals for the linear regression and for the solution of the parabolic boundary value problem, as well as on the bilateral critical domain constructed for that solution, is studied. It is shown that for lower values of the reliability level, the reliable intervals are narrower for all types of samples, as are the corresponding bilateral critical domains.
The developed methodology can be generalized to a wider class of inverse and direct boundary value problems for parabolic partial differential equations, including those with other types of boundary conditions (e.g., Neumann or Robin conditions), variable coefficients, or time-dependent sources. Moreover, the approach can be adapted for models incorporating nonlinear regression or experimental data obtained not only at boundaries but also within the domain. The integration of probabilistic modeling and statistical analysis holds potential for studying transport processes in more complex systems, such as multilayer or randomly inhomogeneous media, where only partial or uncertain data are available. Future research may focus on these generalizations to enhance the applicability of the proposed method.
Another promising direction is the refinement of the statistical modeling of transport processes using more complex regression structures and the incorporation of stochastic effects, especially in systems with multiphase or spatially heterogeneous properties and incomplete boundary or initial data.
These directions open new possibilities for integrating statistical regression with physical modeling in the analysis of transport phenomena under uncertainty.
The results were obtained as part of a grant from the Ministry of Education and Science of Ukraine (project number 0123U101691).