1. Introduction
Linear systems play a pivotal role in various fields of mathematics, science, engineering, and technology, showcasing their fundamental significance [1,2,3,4,5]. The study and solution of linear systems of algebraic equations provide a versatile framework for modeling and analyzing a wide array of real-world phenomena [6,7,8,9,10,11,12,13,14]. These systems offer a structured approach to represent relationships and dependencies among variables, enabling the formulation of mathematical models for diverse applications, including physics, economics, biology, and computer science. Linear systems also serve as the foundation for understanding more-complex mathematical structures, paving the way for advanced techniques and methodologies in numerical analysis, optimization, and data analysis. The ability to efficiently solve linear systems is crucial for addressing practical problems, making them an indispensable tool in the toolkit of researchers, scientists, and engineers across various disciplines. Whether in simulating physical systems, optimizing resource allocation, or processing data, the significance of linear systems lies in their capacity to provide a systematic and powerful framework for problem-solving and decision-making.
Beyond their foundational role in mathematics, linear systems offer a practical and indispensable tool for solving real-world problems [15,16,17,18]. The linear nature of these systems simplifies their mathematical representation, making them particularly amenable to analytical and computational methods. In engineering, for instance, linear systems are pervasive in control theory, signal processing, and structural analysis. The ability to solve linear systems efficiently contributes to the design and optimization of systems ranging from electronic circuits to mechanical structures. Moreover, in economics and finance, linear models are frequently employed to analyze relationships between variables, aiding in decision-making processes. The significance of linear systems extends to scientific research, where they are utilized for data analysis, image processing, and simulations. In essence, the versatility and applicability of linear systems underscore their significance across a broad spectrum of disciplines, emphasizing their role as a cornerstone in the theoretical understanding and practical application of mathematical concepts in the real world.
The significance of linear systems transcends their mere existence as mathematical constructs; rather, they embody a fundamental and pervasive concept with profound implications across various domains of human inquiry. At its core, a linear system encapsulates relationships, dependencies, and interactions within a structured framework, providing a mathematical language to express and decipher complex phenomena. In physics, the laws governing many natural phenomena exhibit linearity, allowing scientists to model and predict behavior. In the intricate world of engineering, linear systems are the backbone of control theory, enabling the design of stable and efficient systems, whether in aerospace engineering or telecommunications.
Moreover, linear systems [19,20,21] serve as a gateway to understanding more-sophisticated mathematical structures. The study of linear algebra and the solutions to linear systems lay the groundwork for advanced mathematical techniques that underpin numerical analysis, optimization, and statistical modeling. In essence, linear systems act as a bridge between theoretical abstraction and real-world problem-solving, facilitating a deeper comprehension of intricate mathematical concepts. The practical implications of efficiently solving linear systems are profound. From the optimization of resource allocation in logistics to the simulation of complex physical processes, the ability to navigate and manipulate linear systems empowers researchers and engineers to tackle challenges of increasing complexity. In economics, linear models elucidate intricate relationships within financial systems, offering insights crucial for informed decision-making.
Linear Algebraic Equations (LAEs) [22,23,24,25] establish the fundamental scaffolding for an array of scientific and engineering pursuits, effectively translating abstract concepts into tangible, real-world applications. Whether navigating intricate simulations in the realm of computational fluid dynamics or delving into intricate optimizations within the field of structural engineering, LAEs provide a mathematical prism that allows us to not only comprehend but also address a diverse range of complex challenges. As computational power continues to burgeon and the intricacy of problems intensifies, attention has gravitated towards large-scale LAEs. These expansive systems not only push the boundaries of analytical methods but also necessitate a computational infrastructure capable of grappling with their sheer magnitude. Within these vast systems lie intricate interrelationships and voluminous datasets, amplifying the critical nature and complexity of their resolution. Successfully addressing these extensive systems often hinges on the application of sophisticated stochastic techniques, highlighting the necessity for advanced and nuanced approaches [26,27].
LAEs serve as a foundational framework in mathematics and its applications, enabling the modeling and analysis of diverse systems and phenomena. The widespread use of LAEs underscores their significance as a foundational and unifying mathematical framework. As computational power continues to advance, the application of linear algebra in interdisciplinary research and problem-solving is likely to expand, further cementing its role as an indispensable tool in both theoretical and applied mathematics. Their broad applicability and the development of specialized techniques for their solution make them a central topic in both theoretical and applied mathematics. The application of linear algebraic equations extends across a wide array of fields, demonstrating their versatility in addressing complex problems and providing essential tools for understanding, modeling, and solving diverse systems and phenomena. Here are some broader perspectives on the applications of LAEs [28,29,30,31,32].
In engineering disciplines, LAEs are fundamental to the analysis and design of structures, circuits, and mechanical systems. They play a pivotal role in solving problems related to fluid dynamics, heat transfer, and electrical networks. In physics, linear algebra is essential for describing quantum mechanics, classical mechanics, and the behavior of physical systems. Linear algebra is at the core of many algorithms and techniques in computer science. From image processing and computer graphics to machine learning and artificial intelligence, linear algebraic equations are used for tasks like image transformation, data representation, and solving systems of equations arising in algorithmic processes. LAEs are applied in economic modeling and financial analysis. Systems of linear equations are used to model economic relationships, optimize resource allocation, and analyze market trends. Techniques such as input-output analysis and linear programming rely on linear algebraic methods. Linear algebra is a fundamental tool in statistics and data analysis. Methods such as linear regression, principal component analysis, and singular value decomposition involve solving and manipulating systems of linear equations. These techniques are crucial for extracting meaningful information from large datasets. LAEs play a central role in optimization problems and control systems engineering. Techniques like linear programming and control theory heavily rely on linear algebraic methods to optimize processes, design controllers, and ensure stability in dynamic systems. Cryptography, the science of secure communication, employs linear algebra in areas such as coding theory and the design of encryption algorithms. Techniques like error-correcting codes and cryptographic protocols often involve solving linear algebraic equations to ensure the security and integrity of transmitted information. LAEs find applications in biomedical imaging, genetics, and bioinformatics. Techniques like image reconstruction in medical imaging, genetic mapping, and the analysis of biological networks involve solving linear systems to interpret and understand complex biological phenomena. Environmental scientists use LAEs to model and analyze environmental systems. This includes studying the flow of pollutants in ecosystems, analyzing climate models, and optimizing resource management strategies to address environmental challenges.
Monte Carlo (MC) algorithms [33,34,35] for linear algebraic problems (LAPs) leverage probabilistic sampling techniques to address challenges associated with large-scale linear systems. These algorithms are particularly useful when traditional methods become computationally expensive or impractical. MC methods are often incorporated into randomized numerical linear algebra algorithms. These methods introduce randomness into the computation, allowing for efficient approximations of matrix factorizations and solutions to linear systems. These techniques are useful for tasks like low-rank approximation, singular value decomposition, and solving linear systems. MC algorithms for LAPs can be applied to solve systems of linear equations. Instead of directly computing the solution, these algorithms use random sampling to approximate the solution or components of the solution vector. This is especially advantageous when dealing with large and sparse matrices. MC integration techniques are adapted for LAPs, where the goal is to approximate integrals involving high-dimensional vectors or matrices. This approach involves random sampling to estimate expectations or integrals, offering a stochastic alternative to deterministic methods. In some applications, matrices involved in linear algebraic problems are too large to be explicitly formed or stored. Matrix-free MC methods address this challenge by performing matrix-vector products on-the-fly, utilizing random samples to compute these products without explicitly representing the entire matrix. MC algorithms provide a stochastic approximation framework for solving LAPs. By iteratively updating the solution based on random samples, these algorithms converge to an approximate solution over multiple iterations. This approach is particularly relevant for large-scale problems where deterministic methods may be impractical. MC algorithms naturally lend themselves to estimating errors and confidence intervals in the computed solutions. By generating multiple random samples and observing the variability in the results, these algorithms provide insights into the uncertainty associated with the computed solution. MC algorithms for LAPs are well-suited for parallel and distributed computing environments. The inherent randomness and independence of MC samples make it possible to perform computations concurrently, improving the scalability of these algorithms for large-scale problems. MC techniques for LAPs find applications in data science and machine learning, especially in tasks involving large datasets and high-dimensional spaces. These algorithms are used for tasks such as approximating covariance matrices, solving linear regression problems, and conducting randomized dimensionality reduction. Some MC algorithms for LAPs incorporate adaptive sampling strategies, where the sampling distribution is adjusted based on the observed outcomes. This adaptability allows the algorithm to focus computational efforts on the most influential components of the problem. MC algorithms for LAPs provide a powerful and flexible approach to handling large-scale computations. Their stochastic nature and adaptability make them well-suited for addressing challenges in diverse fields, ranging from scientific computing to data-driven applications. As computational resources continue to advance, the role of MC methods in addressing complex LAPs is likely to expand further.
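As a concrete illustration of the matrix-free sampling idea above, the following minimal sketch (in Python/NumPy, not a production implementation) estimates a matrix-vector product $y = Ax$ from sampled columns only; the column-sampling density proportional to $|x_j|$ is one common choice, assumed here purely for illustration.

```python
import numpy as np

def mc_matvec(A, x, n_samples=1000, rng=None):
    """Unbiased Monte Carlo estimate of y = A @ x using sampled columns.

    Columns are drawn with probability proportional to |x_j| (an
    importance-sampling choice assumed here for illustration), so the
    estimator touches only n_samples columns of A instead of all of them.
    """
    rng = np.random.default_rng(rng)
    p = np.abs(x)
    p = p / p.sum()                      # sampling density over column indices
    idx = rng.choice(len(x), size=n_samples, p=p)
    # E[A[:, j] * x[j] / p[j]] = sum_j A[:, j] * x[j] = A @ x
    samples = A[:, idx] * (x[idx] / p[idx])
    return samples.mean(axis=1)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((200, 200))
    x = rng.standard_normal(200)
    approx = mc_matvec(A, x, n_samples=20000, rng=1)
    exact = A @ x
    print("relative error:", np.linalg.norm(approx - exact) / np.linalg.norm(exact))
```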
A comprehensive overview of these algorithms is provided in [36]. The renowned Power method [37] furnishes an evaluation of the dominant eigenvalue $\lambda_{\max}$, employing the well-known Rayleigh quotient. In [38], the authors delve into bilinear forms of matrix powers, employing them to devise their solution. Both theoretical and experimental examinations address the algorithm's robustness. It is elucidated that as perturbations in the entries of perfectly balanced matrices increase, both the error and the variance witness a corresponding escalation. In particular, smaller matrices exhibit a heightened variance. As the power of $A$ increases, a rise in the relative error becomes apparent.
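For context, the sketch below shows a generic textbook Power method driven by the Rayleigh quotient; it is not the bilinear-form Monte Carlo estimator of [38], only an illustration of the quantity being approximated.

```python
import numpy as np

def power_method(A, n_iter=200, rng=None):
    """Estimate the dominant eigenvalue of A via the Power method,
    using the Rayleigh quotient (v, Av)/(v, v) at the end."""
    rng = np.random.default_rng(rng)
    v = rng.standard_normal(A.shape[0])
    for _ in range(n_iter):
        w = A @ v
        v = w / np.linalg.norm(w)        # renormalise to avoid overflow
    return v @ (A @ v) / (v @ v)          # Rayleigh quotient ~ lambda_max

if __name__ == "__main__":
    A = np.diag([5.0, 2.0, 1.0]) + 0.1 * np.ones((3, 3))
    print("lambda_max estimate:", power_method(A, rng=0))
```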
Iterative Monte Carlo (IMC) methods [36,39,40] are a class of computational techniques that utilize iterative processes inspired by Monte Carlo simulations to solve complex problems, particularly those involving probabilistic or random elements. These methods are iterative in nature, meaning that they involve repeated cycles of computation to refine an estimate or solution. IMC methods are powerful tools that provide flexible and efficient solutions to a wide range of problems across different disciplines. Their ability to handle complex, high-dimensional problems makes them particularly valuable in situations where traditional analytical methods may be impractical or infeasible. As computational resources continue to advance, the application of IMC methods is likely to expand further, contributing to advancements in scientific research, optimization, and data-driven decision-making. IMC methods designed to solve LAEs can be conceptualized as truncated Markov chains [41]: $T_k: \alpha_0 \to \alpha_1 \to \cdots \to \alpha_k$, where each state $\alpha_i$ for $i \ge k$ represents an absorbing state in the chain.
To devise an algorithm for assessing the minimal eigenvalue by modulus, denoted as $\lambda_{\min}$, a matrix polynomial $p_k(q, A) = \sum_{i=0}^{k} q^i C_{m+i-1}^{i} A^i$ is essential. Here, $C_{m+i-1}^{i}$ represents binomial coefficients, and $q$ serves as an acceleration parameter [23,42,43]. This approach aligns with a discrete analog of the resolvent analytical continuation method utilized in [44]. There are instances where the polynomial converges to the resolvent matrix $[I - qA]^{-m}$ [25,36,42]. Notably, implementing an acceleration parameter that takes the resolvent into account is one avenue for reducing computational complexity. Alternatively, employing a variance reduction technique [45] serves to achieve the required solution approximation with fewer operations. The variance reduction technique proposed in [45] for eigenvalue calculations in particle transport utilizes MC approximations of fluxes. In [46], an unbiased estimator for solving linear algebraic systems (LAS) is introduced. This estimator is adept at finding specific components of the solution, accompanied by comprehensive results elucidating its quality and properties. The author leverages this estimator to establish error bounds and construct confidence intervals for the solution components. Furthermore, ref. [47] introduces an MC method for matrix inversion, rooted in the solution of simultaneous LAEs. In our subsequent exploration, insights from both [36,47] will be incorporated to demonstrate how the proposed algorithm can effectively approximate the inverse of a matrix.
2. Problem Formulation
We examine the following linear algebraic problems (LAPs):
Computing elements of the solution vector, which could involve a subset of components or all components.
Assessing linear functionals of the solution.
Performing matrix inversion.
Determining the extremal eigenvalues.
Let $A$ and $B$ be matrices of size $n \times n$, i.e., $A, B \in \mathbb{R}^{n \times n}$, and let $x, b \in \mathbb{R}^{n \times 1}$ be vectors. We use the $L_1$ vector norm, $\|x\| = \|x\|_1 = \sum_{i=1}^{n} |x_i|$, and $\|A\|$ denotes the corresponding induced matrix norm. Consider a matrix $A \in \mathbb{R}^{n \times n}$ and a vector $x \in \mathbb{R}^{n \times 1}$. The matrix $A$ can be treated as a linear operator $A: \mathbb{R}^{n} \to \mathbb{R}^{n}$; the linear transformation $Ax$ determines a new vector in $\mathbb{R}^{n \times 1}$.
Suppose that this LAE is given: $Bx = b$ (2). The inverse matrix task is the same as solving the task (2) $n$ times, i.e., solving $B c_j = e^{(j)}$, $j = 1, \ldots, n$, where $e^{(j)}$ is the $j$-th canonical unit vector and $c_j$ is the $j$-th column of $B^{-1}$.
Let the matrix $D$ be a diagonal matrix with entries $d_{ii} = \gamma / b_{ii}$, $i = 1, \ldots, n$, where $\gamma \in (0, 1]$ is a parameter for accelerating the convergence. The system (2) can be expressed as
$x = Ax + f,$
where $A = I - DB$ and $f = Db$. Let us assume that the matrix $B$ exhibits diagonal dominance. However, it is worth noting that the algorithms presented are applicable to a broader class of matrices, as demonstrated later. Specifically, if $B$ is diagonally dominant, the following conditions on the elements of the matrix $A$ are met: $\sum_{j=1}^{n} |a_{ij}| \le 1$, $i = 1, \ldots, n$.
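The preconditioning step can be sketched as follows, assuming the standard choice $d_{ii} = \gamma / b_{ii}$ described above; the test matrix and the value of $\gamma$ are illustrative only.

```python
import numpy as np

def precondition(B, b, gamma=1.0):
    """Rewrite B x = b as x = A x + f with A = I - D B and f = D b,
    where D = diag(gamma / b_ii) and gamma in (0, 1] is the
    convergence-acceleration parameter."""
    d = gamma / np.diag(B)
    A = np.eye(B.shape[0]) - d[:, None] * B   # row-scale B by d, subtract from I
    f = d * b
    return A, f

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 5
    B = rng.random((n, n)) + n * np.eye(n)    # diagonally dominant test matrix
    b = rng.random(n)
    A, f = precondition(B, b, gamma=1.0)
    # row sums of |A| should not exceed 1 for a diagonally dominant B
    print("max row sum of |A|:", np.abs(A).sum(axis=1).max())
    print("same solution as B x = b:",
          np.allclose(np.linalg.solve(np.eye(n) - A, f), np.linalg.solve(B, b)))
```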
2.1. Problem Settings
We shall consider the following problems.
Problem 1 (Evaluating the inner product). Evaluate the inner product $J(x) = (v, x) = \sum_{i=1}^{n} v_i x_i$ of the solution of the LAS $Bx = b$, where $B \in \mathbb{R}^{n \times n}$ is a given matrix; $b \in \mathbb{R}^{n \times 1}$ and $v \in \mathbb{R}^{n \times 1}$ are given vectors.
It is feasible to select a non-singular matrix $M \in \mathbb{R}^{n \times n}$ for which $MB = I - A$, where $I$ is the identity matrix, and $Mb = f$, $f \in \mathbb{R}^{n \times 1}$; the system then takes the equivalent form $x = Ax + f$.
It is taken into consideration that the Neumann series converges, so every eigenvalue $\lambda(A)$ satisfies $|\lambda(A)| < 1$. If the criteria outlined in the conditions (i) and (ii) are satisfied, then a stationary linear iterative algorithm can be employed:
$x^{(k+1)} = A x^{(k)} + f, \quad k = 0, 1, 2, \ldots,$
and the solution $x$ could be expressed as
$x = \sum_{k=0}^{\infty} A^k f,$
known as the von Neumann series.
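A minimal sketch of this stationary iteration and its Neumann-series limit is given below; it is a generic illustration of the iteration $x^{(k+1)} = A x^{(k)} + f$, not the WE algorithm itself, and the test matrix is artificial.

```python
import numpy as np

def neumann_iteration(A, f, n_iter=100):
    """Stationary linear iteration x_{k+1} = A x_k + f, which converges to
    the Neumann-series solution x = sum_k A^k f whenever ||A|| < 1."""
    x = np.zeros_like(f)
    for _ in range(n_iter):
        x = A @ x + f
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n = 50
    A = rng.standard_normal((n, n))
    A *= 0.5 / np.abs(A).sum(axis=1).max()     # rescale so the max row sum is 0.5 < 1
    f = rng.standard_normal(n)
    x = neumann_iteration(A, f)
    print("fixed-point residual:", np.linalg.norm(x - (A @ x + f)))
```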
Problem 2 (Assessing all components of the solution vector). Here, we express Equation (7) with $v = e^{(j)}$, the $j$-th canonical unit vector, and likewise obtain the component $x_j = (e^{(j)}, x)$.
Problem 3 (Inverting of matrices). This consists of the computation of $C = B^{-1}$, where $B \in \mathbb{R}^{n \times n}$ is a specified real matrix. It is presupposed that the matrix $B$ is non-singular. Here the subsequent iterative matrix $A = I - DB$ can be defined. When the conditions (ii) are met, $B^{-1}$ can be expressed as the convergent series $B^{-1} = \sum_{k=0}^{\infty} A^k D$.
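Assuming the series representation above, a truncated-series approximation of $B^{-1}$ can be sketched as follows; the number of terms and the test matrix are illustrative.

```python
import numpy as np

def approx_inverse(B, gamma=1.0, n_terms=60):
    """Approximate B^{-1} by the truncated series (sum_{k<n_terms} A^k) D,
    where A = I - D B and D = diag(gamma / b_ii); valid when the series converges."""
    n = B.shape[0]
    d = gamma / np.diag(B)
    A = np.eye(n) - d[:, None] * B
    S = np.eye(n)                 # partial sum of A^k, starting with k = 0
    Ak = np.eye(n)
    for _ in range(1, n_terms):
        Ak = Ak @ A
        S += Ak
    return S * d[None, :]         # multiply on the right by the diagonal matrix D

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    n = 6
    B = rng.random((n, n)) + n * np.eye(n)    # diagonally dominant, so the series converges
    C = approx_inverse(B, n_terms=80)
    print("inversion error:", np.linalg.norm(C @ B - np.eye(n)))
```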
2.2. Almost Optimal Monte Carlo Algorithm
We will employ the algorithm known as the MAO algorithm. It defines an Almost Optimal Method (Method Almost Optimal, MAO) according to the definition given by I. Dimov in [48].
Take an initial density vector $p = \{p_i\}_{i=1}^{n}$, where $p_i \ge 0$ and $\sum_{i=1}^{n} p_i = 1$. Also, contemplate a matrix representing transition densities, $P = \{p_{ij}\}_{i,j=1}^{n}$, where $p_{ij} \ge 0$ and $\sum_{j=1}^{n} p_{ij} = 1$ for any $i = 1, \ldots, n$. Establish sets of permissible densities $\mathcal{P}_v$ and $\mathcal{P}_A$.
Definition 1. The initial density vector $p = \{p_i\}_{i=1}^{n}$ is called permissible to the vector $v = \{v_i\}_{i=1}^{n}$, i.e., $p \in \mathcal{P}_v$, if $p_i > 0$ when $v_i \ne 0$ and $p_i = 0$ when $v_i = 0$, for $i = 1, \ldots, n$.
The transition density matrix $P = \{p_{ij}\}_{i,j=1}^{n}$ is called permissible to the matrix $A = \{a_{ij}\}_{i,j=1}^{n}$, i.e., $P \in \mathcal{P}_A$, if $p_{ij} > 0$ when $a_{ij} \ne 0$ and $p_{ij} = 0$ when $a_{ij} = 0$, for $i, j = 1, \ldots, n$.
In this study, we will focus on densities that adhere to the defined criteria. This strategy ensures that the generated random trajectories, designed to address the problems at hand, consistently avoid encountering elements with zero values in the matrix. This method not only reduces the computational intricacies of the algorithms but also proves advantageous, particularly when handling large sparse matrices.
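A minimal sketch of the standard MAO choice of permissible densities (initial density proportional to $|v_i|$, transition densities proportional to $|a_{ij}|$) is given below; it assumes that $v$ is not the zero vector and that no row of $A$ vanishes.

```python
import numpy as np

def mao_densities(v, A):
    """Construct MAO ('almost optimal') permissible densities: the initial
    density is proportional to |v_i| and the transition densities are
    proportional to |a_ij|, so zero entries of v and A are never sampled."""
    p = np.abs(v) / np.abs(v).sum()                      # initial density, p in P_v
    row_sums = np.abs(A).sum(axis=1, keepdims=True)
    P = np.abs(A) / row_sums                             # transition densities, P in P_A
    return p, P

if __name__ == "__main__":
    rng = np.random.default_rng(3)
    A = rng.standard_normal((4, 4))
    v = np.array([1.0, 0.0, 2.0, -1.0])
    p, P = mao_densities(v, A)
    print(p, P.sum(axis=1))          # rows of P sum to one; p[1] = 0 since v[1] = 0
```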
Suppose we have a Markov chain
$T = \alpha_0 \to \alpha_1 \to \cdots \to \alpha_k \to \cdots$
with $n$ states. The random trajectory (chain) $T_k$ of length $k$ originating from the initial state $\alpha_0$ is specified as
$T_k = \alpha_0 \to \alpha_1 \to \cdots \to \alpha_j \to \cdots \to \alpha_k.$
Here $\alpha_j$ refers to the state selected at step $j$, indicating $\alpha_j \in \{1, 2, \ldots, n\}$.
Assume that
$P(\alpha_0 = \alpha) = p_\alpha, \qquad P(\alpha_j = \beta \mid \alpha_{j-1} = \alpha) = p_{\alpha\beta},$
where $p_\alpha$ represents the probability of the chain initiating in state $\alpha$, and $p_{\alpha\beta}$ is the transition probability to state $\beta$ after being in state $\alpha$. These probabilities $p_{\alpha\beta}$ collectively form a transition matrix $P$.
Our MAO WE algorithm, configured with a distinctive selection of absorbing states, entails the presence of absorbing states in addition to the $n$ ordinary states. The transition density matrix is designed in such a way that it is permissible to the matrix $A$, and a prescribed probability of absorption is applied at each stage, excluding the initial one.
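The following schematic sketch shows how such a random trajectory can be generated from an initial density and a transition matrix; the fixed absorption probability p_abs is a hypothetical illustration parameter, not the absorption rule used by the WE algorithm.

```python
import numpy as np

def sample_trajectory(p, P, p_abs=0.1, max_len=1000, rng=None):
    """Sample one random trajectory alpha_0 -> alpha_1 -> ... over the states
    {0, ..., n-1}: the initial state is drawn from the density p, transitions
    from the rows of P, and at every stage after the initial one the chain is
    absorbed (terminated) with probability p_abs (a hypothetical value chosen
    here only for illustration)."""
    rng = np.random.default_rng(rng)
    n = len(p)
    state = rng.choice(n, p=p)
    trajectory = [state]
    for _ in range(max_len):
        if rng.random() < p_abs:          # absorption: the trajectory is broken here
            break
        state = rng.choice(n, p=P[state])
        trajectory.append(state)
    return trajectory

if __name__ == "__main__":
    p = np.array([0.5, 0.25, 0.25])
    P = np.array([[0.2, 0.5, 0.3],
                  [0.6, 0.1, 0.3],
                  [0.3, 0.3, 0.4]])
    print(sample_trajectory(p, P, p_abs=0.2, rng=0))
```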
2.3. Absorbing States Monte Carlo Algorithm
Rather than dealing with the finite random trajectory $T_k$, we examine an infinite trajectory characterized by state coordinates $\delta_j$. Suppose $\delta_j = 0$ if the trajectory is broken (absorbed) and $\delta_j = 1$ otherwise. Let
$\Delta_j = \delta_0\, \delta_1 \cdots \delta_j.$
Thus, $\Delta_j$ equals 1 until the trajectory's first termination, and thereafter it becomes 0.
It can be demonstrated that, given conditions (i) and (ii), these equalities hold:
5. Numerical Tests
In this section, we present extensive numerical simulations to demonstrate the capabilities of the proposed approaches. The algorithms are implemented and run in a MATLAB® R2021b environment, using a mobile computer with a 6-core processor and 16 GB of RAM.
5.1. The New Algorithm versus Deterministic Algorithms
On the graphs we will illustrate the relative error of a selected component of the computed solution $\hat{x}$, with which we check the accuracy of $\hat{x}$ against the exact solution vector $x$. One may observe that the results for all other components exhibit similar trends.
To assess whether the weighted residual [57,58] can serve as a reliable indicator of error, we juxtapose the relative error against the residual for a test matrix (refer to Figure 2). It is evident [1] that the curves representing the relative error and the residual closely align; the number of random trajectories $N$ for the MC runs is kept fixed. It is already established in [1] that for a much larger dense matrix of size n = 25,000 the WE method is clearly better than the Jacobi method—see Figure 3.
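The comparison can be reproduced in spirit with the sketch below, which contrasts the true relative error with a residual-based indicator; a plain relative residual $\|b - B\hat{x}\| / \|b\|$ is used here as a stand-in, since the exact weighted residual is defined in [57,58].

```python
import numpy as np

def error_and_residual(B, b, x_hat, x_exact):
    """Compare the true relative error of an approximate solution x_hat with a
    residual-based indicator; the plain relative residual ||b - B x_hat|| / ||b||
    is used as a stand-in for the weighted residual of [57,58]."""
    rel_error = np.linalg.norm(x_hat - x_exact) / np.linalg.norm(x_exact)
    rel_residual = np.linalg.norm(b - B @ x_hat) / np.linalg.norm(b)
    return rel_error, rel_residual

if __name__ == "__main__":
    rng = np.random.default_rng(4)
    n = 100
    B = rng.random((n, n)) + n * np.eye(n)
    x_exact = rng.standard_normal(n)
    b = B @ x_exact
    x_hat = x_exact + 1e-6 * rng.standard_normal(n)   # a slightly perturbed "computed" solution
    print(error_and_residual(B, b, x_hat, x_exact))
```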
Here we conduct a number of numerical experiments with dense matrices of size n = 10, 100, 1000, 10,000. A comparative analysis is conducted with well-established deterministic methods, specifically the Gauss–Seidel and Jacobi algorithms. Key observations are as follows. Firstly, across all experiments, Gauss–Seidel consistently outperforms the Jacobi method by a margin of at least three orders of magnitude. Notably, this study introduces a comparison with Gauss–Seidel for the first time, as previous comparisons in [1] exclusively involved the Jacobi method. For the first matrix considered in Figure 4, after 20 iterations, the relative error produced by the Monte Carlo approach is about two orders of magnitude better than the relative error for the Gauss–Seidel method and seven orders of magnitude better than the Jacobi approach. For the next matrix size in Figure 4, after just 15 iterations, the relative error from the Monte Carlo approach is approximately six orders of magnitude better than the Gauss–Seidel method and ten orders of magnitude better than the Jacobi method. As we increase the matrix dimension further, the new Monte Carlo approach continues to exhibit superior performance. In Figure 4, after 15 iterations, the relative error from the Monte Carlo approach is approximately seven orders of magnitude better than that of the Gauss–Seidel method and ten orders of magnitude better than the Jacobi method. This trend becomes even more pronounced with higher matrix dimensions. The most interesting case is the very high-dimensional matrix of size 10,000 × 10,000; the results for this case are presented in Figure 4. Here, after just 14 iterations, the Monte Carlo method achieves the lowest relative error, while for the same number of iterations Gauss–Seidel and Jacobi remain several orders of magnitude behind. Consequently, for very high dimensions, the developed Monte Carlo approach emerges as the most accurate, surpassing other methods by a significant margin across multiple orders of magnitude.
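For reference, the deterministic baselines used in this comparison are the classical Jacobi and Gauss–Seidel iterations; generic textbook versions are sketched below on an illustrative diagonally dominant test matrix.

```python
import numpy as np

def jacobi(B, b, n_iter=100):
    """Classical Jacobi iteration for B x = b (deterministic baseline)."""
    d = np.diag(B)
    R = B - np.diag(d)
    x = np.zeros_like(b)
    for _ in range(n_iter):
        x = (b - R @ x) / d
    return x

def gauss_seidel(B, b, n_iter=100):
    """Classical Gauss-Seidel iteration: each component is updated in place
    using the most recent values of the other components."""
    n = len(b)
    x = np.zeros_like(b)
    for _ in range(n_iter):
        for i in range(n):
            s = B[i, :i] @ x[:i] + B[i, i + 1:] @ x[i + 1:]
            x[i] = (b[i] - s) / B[i, i]
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(5)
    n = 200
    B = rng.random((n, n)) + n * np.eye(n)
    x_exact = rng.standard_normal(n)
    b = B @ x_exact
    for method in (jacobi, gauss_seidel):
        x = method(B, b, n_iter=20)
        print(method.__name__, np.linalg.norm(x - x_exact) / np.linalg.norm(x_exact))
```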
We also conduct a range of computational experiments involving dense matrices of increasing dimensions. These experiments are compared against well-established deterministic algorithms, specifically the Gauss–Seidel and Jacobi methods, to facilitate a thorough analysis. Notably, the results reveal consistent superiority in accuracy for the Gauss–Seidel algorithm compared to the Jacobi method, with a minimum difference of three orders of magnitude in each computational test. This juxtaposition with Gauss–Seidel is a distinctive contribution, as prior studies such as [1] exclusively focused on comparing with the Jacobi method. Taking the example of the first matrix illustrated in Figure 5, after 15 iterations, our Monte Carlo approach demonstrates a relative error approximately five orders of magnitude lower than Gauss–Seidel and eight orders of magnitude lower than Jacobi. Examining a larger matrix (also illustrated in Figure 5) and restricting the iterations to just 15, we observe that the Monte Carlo approach outperforms Gauss–Seidel by roughly six orders of magnitude and surpasses the Jacobi method by approximately ten orders of magnitude in relative error. As we further elevate the matrix size, the performance differential becomes even more striking. With just 15 iterations (see Figure 5), the Monte Carlo method demonstrates a relative error that is seven orders of magnitude superior to Gauss–Seidel and ten orders of magnitude better than Jacobi. Hence, we can affirm that the performance disparity between our Monte Carlo approach and the Gauss–Seidel algorithm expands as we increase the matrix dimensions. The apex of our analysis is focused on the colossal matrix whose outcomes are illustrated in Figure 5. In a mere 14 iterations, the Monte Carlo methodology achieved an impressively low relative error, whereas Gauss–Seidel registered a noticeably larger error and Jacobi languished further behind. Hence, for exceedingly large-scale problems, our developed Monte Carlo approach stands out, surpassing other algorithms by a substantial margin in terms of accuracy.
5.2. WE versus PCG
We evaluate our outcomes against the optimal deterministic preconditioned conjugate gradient (PCG) method [57,59,60].
Here is more detailed information regarding our PCG implementation. Our objective is to address a linear system of equations, $Bx = b$, utilizing the PCG iterative method. The matrix $B$ in our context is a square and sizable matrix from the widely recognized Harwell–Boeing collection, with the requirement that $B$ is symmetric and positive definite. The NOS4 matrix is a notable example from this collection, primarily because of its distinct characteristics and the challenges it presents in computational processing.
The NOS4 matrix is a sparse matrix, a common type found in the Harwell–Boeing collection, known for its large size and sparsity. This matrix is particularly representative of structural engineering problems, often derived from finite element analysis. The specific features of the NOS4 matrix are as follows.
The NOS4 matrix exhibits a highly sparse structure, which is characteristic of matrices derived from finite element methods in engineering. This sparsity is a significant factor in computational processing, as it requires specialized algorithms that can efficiently handle non-dense data distributions.
The matrix is relatively large, which adds to the computational challenge. Its size and the complexity of its elements make it an ideal candidate for testing the scalability and efficiency of our algorithm.
The matrix typically contains numerical values that represent physical properties in structural engineering contexts, such as stiffness and material properties. These values are crucial for accurately simulating and analyzing structural behavior under various conditions.
Specifically, in the context of our study, the NOS4 matrix is utilized to model problems in constructive mechanics. This involves simulations that are critical for understanding and predicting the behavior of structures, particularly in the design and analysis of buildings, bridges, and other infrastructural elements.
In our implementation, the parameters employed are as follows: $b$ represents the right-hand side vector; $tol$ stands for the required relative tolerance for the residual error, signifying that the iteration halts if $\|b - Bx\| \le tol \cdot \|b\|$; $maxit$ denotes the maximum permissible number of iterations; $M$ corresponds to the (left) preconditioning matrix, ensuring that the iteration is theoretically equivalent to solving by PCG the system $M^{-1}Bx = M^{-1}b$; and $x_0$ signifies the initial guess.
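The role of these parameters can be seen in the following generic PCG sketch with a simple diagonal (Jacobi) preconditioner; it mirrors the interface described above but is not the exact implementation used for the experiments, and the test matrix is illustrative.

```python
import numpy as np

def pcg(B, b, M_inv, tol=1e-10, maxit=500, x0=None):
    """Preconditioned conjugate gradient for a symmetric positive definite B.
    M_inv(r) applies the inverse of the (left) preconditioner M to a vector;
    the iteration stops once ||b - B x|| <= tol * ||b|| or after maxit steps."""
    x = np.zeros_like(b) if x0 is None else x0.copy()
    r = b - B @ x
    z = M_inv(r)
    p = z.copy()
    rz = r @ z
    for _ in range(maxit):
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        Bp = B @ p
        alpha = rz / (p @ Bp)
        x += alpha * p
        r -= alpha * Bp
        z = M_inv(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(6)
    n = 300
    Q = rng.standard_normal((n, n))
    B = Q @ Q.T + n * np.eye(n)                 # symmetric positive definite test matrix
    b = rng.standard_normal(n)
    jacobi_prec = lambda r: r / np.diag(B)      # simple diagonal (Jacobi) preconditioner
    x = pcg(B, b, jacobi_prec)
    print("relative residual:", np.linalg.norm(b - B @ x) / np.linalg.norm(b))
```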
Some detailed information about the heaviest diagonals and the conditioning of the matrix NOS4 is given in Figure 6. The behaviour of the PCG for NOS4 is illustrated in Figure 7. Our approach, devoid of such an optimization process, yields superior results. It is noteworthy that the specific matrix (NOS4) appears to pose a challenging test for PCG, but generalizing this observation is complex, as PCG success depends on the specific functional minimized in the method. Further analysis is required to determine the conditions under which the IWE and WE Monte Carlo methods outperform the widely used PCG.
Figure 8 depicts the convergence analysis of the WE Monte Carlo and PCG methods when applied to a matrix originating from an application involving the finite element approximation of a problem associated with a beam structure in structural mechanics [61].
The outcomes depicted in Figure 8 indicate that the WE Monte Carlo exhibits superior convergence: the curve of the residual versus the number of iterations is smoother and reaches considerably lower values than the PCG curve. It is important to emphasize that, for the experiments showcased, the desired accuracy is specified in advance. One of the most significant distinctions between the Preconditioned Conjugate Gradient (PCG) method and our Walk on Equations (WE) Monte Carlo algorithm lies in their respective operational mechanisms within Krylov subspaces. While both methods operate within these optimal subspaces, the approach each takes is fundamentally different, leading to a crucial advantage for the WE algorithm.
The PCG method is fundamentally an optimization-based approach. It seeks to minimize the residual of the solution iteratively within the Krylov subspace. This process, while effective under certain conditions, inherently runs the risk of becoming ensnared in local minima. These local minima, which are close but not equivalent to the global minimum, can significantly hinder the convergence of the PCG method towards the most accurate solution. This susceptibility is illustrated in Figure 8, where the PCG method's trajectory towards convergence is impeded by the presence of a local minimum.
In contrast, our WE algorithm operates within the same Krylov subspaces but diverges from the PCG method by eschewing the optimization paradigm entirely. Instead, it leverages the probabilistic nature of Monte Carlo methods to traverse the solution space. This approach inherently bypasses the pitfalls of optimization-based methods, such as getting trapped in local minima. The WE algorithm, by not being constrained to follow the gradient descent pathway that optimization methods inherently use, is not prone to stalling or slowing in areas that only represent local solutions. This attribute allows the WE algorithm to explore the solution space more freely and thoroughly, increasing the likelihood of reaching the global solution without the hindrance of local minima.
Therefore, the fundamental difference in operational philosophy between the PCG and WE methods not only delineates their theoretical underpinnings but also translates into a practical advantage for the WE algorithm. By navigating the solution space without the constraints of an optimization problem, the WE algorithm demonstrates a robustness and efficiency in finding the global solution, particularly in complex linear algebraic systems where local minima can be a significant obstacle.
5.3. Testing the Balancing
It is noteworthy that matrix balancing is a crucial concern for deterministic algorithms. Consequently, an arbitrary matrix necessitates a balancing procedure as part of the preprocessing steps. While MC algorithms are generally less sensitive to balancing, balancing does influence the convergence of the WE MC algorithm. As already established in [1], in the first instance, illustrated in Figure 9 (top left), we examine two small matrices of the same size. The balanced matrix is characterized by the corresponding value of the parameter $a$, while the unbalanced matrix exhibits an extreme level of imbalance (the ratio between the largest and smallest elements by magnitude is 40). It is worth noting that a ratio is considered large when it reaches 5.
Certainly, for the unbalanced matrix, convergence is entirely absent. In contrast, the balanced matrix exhibits rapid convergence, albeit with a somewhat erratic red curve, attributable to the limited number of random trajectories. To explore this further, we revisited the same matrices, this time augmenting the number of random trajectories for the unbalanced case. Concurrently, we reduced the number of random trajectories (and consequently the computational time) for the balanced matrix. The outcomes of this experiment are illustrated in Figure 9 (top right).
The results clearly indicate that the WE algorithm initiates convergence, reaching three correct digits after approximately 35 to 40 iterations. While the balanced matrix still demonstrates swift convergence, the red curve retains its roughness, reflective of the limited number of random trajectories.
For randomly generated matrices of larger size, akin results are presented in Figure 9 (bottom left). The unbalanced matrix results from random perturbations to the elements of the balanced matrix. Both the green and blue curves pertain to the same unbalanced matrix, with the latter achieved by leveraging ten times more random trajectories. This demonstrates that increasing the number $N$ of random trajectories can enhance convergence, aligning with theoretical expectations.
In our specific numerical experiments depicted in Figure 9 (bottom left), the green curve is derived from a smaller set of trajectories, while the blue curve is based on ten times as many trajectories. It is evident that, in the former case, the algorithm fails to converge, whereas in the latter case, convergence is achieved. These experiments underscore that the WE Monte Carlo exhibits an additional degree of freedom, absent in deterministic algorithms, allowing for enhanced convergence even with severely imbalanced matrices. However, this improvement in convergence comes at the cost of increased complexity, or computational time, due to the augmented number of simulation trajectories.
To explore the impact of matrix imbalance on the convergence of the WE Monte Carlo, we conducted a noteworthy experiment. By randomly generating matrices of varying sizes, we introduced random perturbations to the matrix elements, progressively increasing the matrix's imbalance. The results for one such matrix are illustrated in Figure 9 (bottom right); its randomly generated entries were additionally perturbed by progressively larger amounts.
Matrix balancing plays a pivotal role in the performance of many computational algorithms, particularly in solving linear algebraic systems. While MC algorithms, including our WE MC algorithm, are generally less sensitive to matrix balancing compared to deterministic algorithms, the impact of balancing on convergence is non-negligible.
To provide a concrete context for the necessity of matrix balancing, let us consider a practical example from structural engineering. In finite element analysis, matrices often represent stiffness or connectivity matrices of a physical structure, like a bridge or building. These matrices can become imbalanced due to variations in material properties or non-uniform loading conditions. An imbalanced matrix in such scenarios could lead to inaccurate or inefficient computational analysis, impacting the reliability of the structural assessment.
Given the practical implications, it is vital to establish criteria for when and how to apply matrix balancing:
Magnitude of Element Variation: Balancing should be considered when there is a significant disparity in the magnitude of matrix elements. A practical threshold could be set; for instance, a ratio of 5 or more between the largest and smallest elements by magnitude indicates a need for balancing (see the sketch after this list).
Application-Specific Needs: In fields such as structural engineering or fluid dynamics, where matrices represent physical phenomena, the need for balancing should be assessed based on the physical realism of the matrix. If imbalances in the matrix do not correspond to realistic physical conditions, balancing may be necessary.
Computational Efficiency: If preliminary runs of the algorithm show a slow convergence rate, this could signal the need for balancing, especially in large matrices where computational efficiency is paramount.
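A minimal sketch of the first criterion (flagging a matrix whose largest-to-smallest nonzero magnitude ratio reaches the suggested threshold of 5) is given below; the test matrices are illustrative.

```python
import numpy as np

def imbalance_ratio(B):
    """Ratio between the largest and smallest nonzero element magnitudes of B.
    Following the criterion above, a ratio of 5 or more suggests that a
    balancing procedure should be applied before running a deterministic solver."""
    mags = np.abs(B[B != 0])
    return mags.max() / mags.min()

if __name__ == "__main__":
    balanced = np.array([[4.0, 2.0, 2.0],
                         [2.0, 5.0, 2.0],
                         [2.0, 2.0, 4.0]])
    unbalanced = balanced.copy()
    unbalanced[0, 1] = 0.1                      # one tiny entry drives the ratio up to 50
    for name, M in (("balanced", balanced), ("unbalanced", unbalanced)):
        r = imbalance_ratio(M)
        print(f"{name}: ratio = {r:.1f}, needs balancing = {r >= 5}")
```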
5.4. Comparison between the Three WE Algorithms
This subsection is very important because it addresses the question of the cases in which the new algorithm outperforms the old algorithm.
We consider randomly chosen matrices of two different sizes with several values of the dominancy parameter. While in the case of a lower dominancy parameter the two algorithms produce similar results, with the new algorithm being better at high sequential steps, for a higher dominancy parameter the advantage of the new algorithm can be clearly seen—see Table 2. The apex of our analysis is focused on the largest matrix, the outcomes for which are illustrated in Table 2. In a mere 5 iterations, the new Monte Carlo algorithm achieved an impressively low relative error, whereas the old WE registered a noticeably larger one. Hence, for exceedingly large-scale problems, our developed Monte Carlo approach stands out, surpassing the old algorithm by a substantial margin in terms of accuracy.
We can see the comparison between the three algorithms in Figure 10. We observe similar behavior between the three algorithms, with a slight advantage for the hybrid and the new WE algorithms due to the fact that the corresponding dominancy number is closer to 1. When we increase the dominancy number, the advantage of the new algorithm versus the old algorithm becomes more evident—see the results in Figure 11. When we augment the dominancy number, the superiority of the new algorithm over the old algorithm becomes increasingly pronounced. As the dominancy number, which characterizes the influence of the diagonal elements within a system, grows, the effectiveness and efficiency of the new algorithm become more evident in comparison to the older method. This trend suggests that the performance gap between the two algorithms widens as the system's key features gain greater prominence, underscoring the enhanced adaptability and efficacy of the new algorithm in handling scenarios with elevated dominancy numbers. This observation reinforces the notion that the advantages of the new algorithm are particularly prominent and advantageous in situations where dominant factors play a more significant role. When we decrease the dominancy number, the advantage of the hybrid algorithm versus the old and new algorithms becomes more evident—see the results in Figure 12. It is worth mentioning that in the case of low dominancy numbers the difference between the old and the new algorithm is very small. This fact could be explained by the proof of Theorem 7, but needs more careful analysis and will be an object of a future study.
Here we describe the significance of the dominancy number and its impact on the performance of our new Monte Carlo algorithm compared to traditional methods. To provide a clearer understanding of how the dominancy number relates to real-world systems, we will establish a link between the bands of this parameter and their applications in various scenarios.
5.4.1. Dominancy Number in Real Systems
1. Structural Engineering and Mechanics: In structural engineering, matrices derived from finite element analysis often have dominancy numbers reflecting the stiffness of the structure. A higher dominancy number (closer to 1) might indicate a structure with more uniform material properties or loading conditions, whereas a lower dominancy number (closer to 0) could represent a structure with highly variable stiffness or irregular load distribution. For instance, in a uniformly loaded beam, the stiffness matrix would likely exhibit a higher dominancy number, indicating the potential effectiveness of our new algorithm in such scenarios.
2. Electrical Networks and Circuit Analysis: In electrical engineering, matrices representing circuit networks often have dominancy numbers that reflect the distribution of resistances or impedances. A circuit with evenly distributed resistances across components may present a higher dominancy number, suggesting a system where our new algorithm could be particularly efficient.
3. Fluid Dynamics and Heat Transfer: Systems modeled for fluid dynamics or heat transfer can have matrices with varying dominancy numbers depending on the homogeneity of the medium or the boundary conditions. For example, a heat transfer problem in a homogenous medium might have a higher dominancy number, indicating the suitability of our new algorithm for such cases.
5.4.2. Practical Implications of Dominancy Number Bands
- High Dominancy Band (≈0.6–1.0): Systems with high dominancy numbers are likely to benefit more from our new algorithm. These systems typically exhibit more uniform properties or conditions, making our algorithm more efficient in handling them.
- Medium Dominancy Band (≈0.4–0.6): In this range, the performance of our new algorithm is comparable to traditional methods, with slight advantages in certain scenarios. Systems in this band often exhibit moderate variability in their properties.
- Low Dominancy Band (≈0.0–0.4): Systems with low dominancy numbers might not exhibit a significant performance difference between our new algorithm and traditional methods. These systems often represent highly variable or irregular conditions.
The dominancy number is a critical parameter in determining the suitability and efficiency of our new Monte Carlo algorithm for various real-world systems. By understanding the implications of different dominancy number bands, practitioners can better assess the applicability of our algorithm to specific scenarios, ranging from structural engineering to electrical networks.
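As a rough illustration of how such bands can be read off a matrix, the sketch below computes a normalized diagonal-dominance ratio in $(0, 1]$, namely the minimum over rows of $|b_{ii}| / \sum_j |b_{ij}|$; this formula is an assumption made here for illustration only and is not necessarily the exact dominancy number defined earlier in the paper.

```python
import numpy as np

def dominance_ratio(B):
    """A diagonal-dominance ratio in (0, 1]: the minimum over rows of
    |b_ii| / sum_j |b_ij|.  This is a hypothetical stand-in used only to
    illustrate the dominancy bands discussed above; it is not necessarily
    the exact dominancy number defined in the paper."""
    diag = np.abs(np.diag(B))
    row_sums = np.abs(B).sum(axis=1)
    return (diag / row_sums).min()

if __name__ == "__main__":
    rng = np.random.default_rng(7)
    n = 100
    off = rng.random((n, n))
    np.fill_diagonal(off, 0.0)
    for scale in (0.5, 2.0, 10.0):              # stronger diagonal -> ratio closer to 1
        B = off + scale * n * np.eye(n) / 10
        print(f"diagonal scale {scale}: ratio = {dominance_ratio(B):.2f}")
```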
6. Discussion
In this section, we discuss the most important features of the new algorithm and its advantages over the original algorithm. As can be seen, the advantage of the new algorithm over the old one depends on the parameter newly introduced in this paper—the so-called dominancy number.
We may observe the following key elements of the presented algorithms.
Performance enhancement through strategic relaxation parameter selection, leading to reduced computation time and relative errors within a fixed number of iterations.
Optimization potential by balancing the iteration matrix, demonstrating improvements in the algorithm’s original form.
The introduced random variable serves as an unbiased estimator for the solution vector. The stochastic error’s structure is analyzed in detail.
The analysis of the stochastic error structure reveals that the convergence rate of the proposed WE Monte Carlo algorithm is influenced by matrix balancing. Although this influence is less pronounced compared to deterministic algorithms, it is still a factor to consider.
Application of a sequential Monte Carlo method by John Halton, based on the control variate method, for further optimization.
Analysis of the relative stochastic error structure, revealing a dependency on matrix balancing for the convergence rate of the WE algorithm.
Comparative analysis of numerical outcomes with the most optimal deterministic method, the Preconditioned Conjugate Gradient (PCG), indicating superior results for the WE Monte Carlo method in certain cases.
The proposed WE Monte Carlo algorithm, in conjunction with the sequential Monte Carlo for computing all solution components, demonstrates significantly faster convergence compared to certain well-known deterministic iterative methods, such as the Jacobi and Gauss–Seidel methods.
The proposed WE Monte Carlo algorithm, in conjunction with the sequential Monte Carlo for computing all solution components, shows faster convergence compared to the old algorithm when the dominancy number is close to 1. This trend is consistent across matrices of various sizes, with a more pronounced effect observed for larger matrices.
Now, when we consider the impact of the dominancy number on the performance of algorithms, particularly the comparison between a new and an old algorithm, the following dynamics come into play:
The dominancy number reflects the degree of diagonal dominance in the system. As this number increases, it implies that specific elements or features have a more substantial influence on the overall system behavior.
The old algorithm might have been designed or optimized based on certain assumptions or characteristics of the problem. However, if the dominancy number is low, the impact of the dominance may not have been fully exploited or addressed by the old algorithm.
The new algorithm, on the other hand, may have been specifically developed or adapted to handle scenarios with higher dominancy numbers. It could incorporate techniques that capitalize on the influence of diagonal dominance. As the dominancy number increases, the advantages of these tailored approaches become more apparent.
The observed performance difference between the old and new algorithms is indicative of the latter's adaptability and efficacy in scenarios where certain factors play a more critical role. The new algorithm may dynamically adjust or scale its strategies based on the prominence of diagonal dominance, leading to improved outcomes in these situations.
In essence, the correlation between the dominancy number and algorithmic performance emphasizes the importance of algorithmic design that considers and leverages the influential factors within a given problem. As the dominancy number grows, the advantages of the new and the hybrid algorithms, specifically crafted to handle such conditions, become increasingly evident, showcasing the importance of adaptability and specialization in algorithmic solutions.
However, there is another issue to be addressed. The need for increased simulation trajectories to achieve convergence, especially when dealing with imbalanced matrices, raises questions about efficiency and practicality, particularly in comparison to canonical methods.
To address this critical point, it is possible to apply a number of strategies to achieve a balance between convergence speed and the number of simulation trajectories:
Adaptive Trajectory Sampling: By implementing an adaptive sampling technique, the algorithm can dynamically adjust the number of trajectories based on the convergence rate. When the algorithm is far from convergence, fewer trajectories can be used. As it approaches the solution, the algorithm can increase the number of trajectories to fine-tune the accuracy. This adaptive approach helps in reducing unnecessary computational overhead in the early stages of the algorithm (see the sketch after this list).
Parallel Computing and Optimization: The inherently parallel nature of Monte Carlo simulations makes our WE algorithm particularly amenable to parallel computing techniques. By distributing the simulation trajectories across multiple processors or a GPU, we can significantly reduce the overall CPU time. Furthermore, optimizing the code for specific hardware architectures can also lead to substantial improvements in computational efficiency.
Hybrid Methodologies: Combining the WE algorithm with deterministic methods can offer a compromise between the robustness of Monte Carlo approaches and the speed of traditional methods. Starting with a deterministic method to reach a closer approximation to the solution and then switching to the WE algorithm for fine-tuning could reduce the total number of simulation trajectories required.
Improving the Dominancy Number Calculation: As our research highlights the importance of the dominancy number in the algorithm’s performance, refining the calculation or estimation of this number can lead to a more efficient trajectory planning. This could involve developing predictive models that estimate the optimal dominancy number based on matrix characteristics.
Advanced Stochastic Techniques: Incorporating advanced stochastic techniques, such as importance sampling or variance reduction strategies, could improve the efficiency of the simulation trajectories. These techniques aim to focus computational effort on the most critical parts of the solution space, thereby reducing the number of trajectories required for a given level of accuracy.
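A schematic sketch of the adaptive-trajectory idea from the first item above is given below: the per-sweep sample budget grows once successive iterates start changing less, concentrating work near convergence. It is built on a plain Monte Carlo matrix-vector estimator rather than on the WE algorithm itself, and the growth schedule is a hypothetical illustration.

```python
import numpy as np

def mc_matvec(A, x, n_samples, rng):
    """Monte Carlo estimate of A @ x from n_samples sampled columns."""
    p = np.abs(x) + 1e-30
    p /= p.sum()
    idx = rng.choice(len(x), size=n_samples, p=p)
    return (A[:, idx] * (x[idx] / p[idx])).mean(axis=1)

def adaptive_solve(A, f, n_sweeps=40, n0=200, growth=1.5, rng=None):
    """Adaptive-trajectory iteration for x = A x + f: early sweeps use few
    random samples (trajectories), and the sample budget grows as successive
    iterates change less.  The trigger and growth factor are illustrative."""
    rng = np.random.default_rng(rng)
    x = f.copy()
    n_traj = n0
    prev_change = np.inf
    for _ in range(n_sweeps):
        x_new = mc_matvec(A, x, n_traj, rng) + f
        change = np.linalg.norm(x_new - x)
        if change < 0.5 * prev_change:      # converging: spend more samples to cut noise
            n_traj = int(growth * n_traj)
        prev_change = change
        x = x_new
    return x, n_traj

if __name__ == "__main__":
    rng = np.random.default_rng(8)
    n = 100
    A = rng.standard_normal((n, n))
    A *= 0.3 / np.abs(A).sum(axis=1).max()     # make the iteration a contraction
    f = rng.standard_normal(n)
    x, final_traj = adaptive_solve(A, f, rng=0)
    x_exact = np.linalg.solve(np.eye(n) - A, f)
    print("final samples per sweep:", final_traj)
    print("relative error:", np.linalg.norm(x - x_exact) / np.linalg.norm(x_exact))
```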
To summarize, while the increase in CPU time due to additional simulation trajectories is a valid concern, especially for imbalanced matrices, the strategies outlined above provide viable pathways to optimize the trade-off between convergence speed and computational effort. Our future work will focus on further investigating and implementing these strategies to enhance the practical applicability of the WE algorithm.
7. Conclusions
Linear systems hold significance not just for their mathematical elegance but for their pervasive influence across various disciplines, serving as a foundational language to articulate and comprehend the complexities of the world. Linear Algebraic Equations are more than mere problems to solve; they act as a vital link between abstract theories and practical realities in scientific and engineering applications. Whether navigating simulations in computational fluid dynamics or optimizing structures in engineering, they offer a mathematical framework to perceive, understand, and tackle intricate challenges. As computational capabilities expand, addressing large-scale linear algebraic problems has become a focal point, pushing the boundaries of both analytical techniques and computational hardware.
In our research, we have introduced and thoroughly investigated a novel Monte Carlo method designed explicitly for large systems of algebraic linear equations. This innovative approach is based on the “Walk on Equations” Monte Carlo model, a recent development attributed to Ivan Dimov, Sylvain Maire, and Jean Michel Sellier. Additionally, we present a new hybrid algorithm, a combination of the two methods, introducing a crucial element—the dominancy number—for algorithmic performance. Notably, our study breaks new ground by providing a comparative analysis with the Jacobi and Gauss–Seidel algorithms, focusing on matrices of considerable dimensions—an aspect largely unexplored in existing scholarly discussions.
To validate our approach, we conduct numerical experiments, including a significant case involving a large system from finite element approximation in structural mechanics, specifically addressing a beam structure problem. Our work contributes to advancing the understanding and effective solution of large-scale linear algebraic problems, showcasing the relevance and potential applications of our proposed Monte Carlo method in practical, real-world scenarios.
Looking ahead to the future landscape of scholarly endeavors in this field, we anticipate a comprehensive and rigorous evaluation that juxtaposes traditional computational methods with our cutting-edge Monte Carlo technique. Moreover, we need to investigate how the proposed method extends to nonlinear problems, and what convergence criteria can be established in that case. Additionally, we aim to broaden the scope of our innovative algorithm by applying it to various matrices drawn from the well-known Harwell–Boeing collection. This initiative seeks to underscore the adaptability and effectiveness of our approach in solving linear algebraic equations crucial to a variety of practical, real-world scenarios.