Discrete Event Systems Theory for Fast Stochastic Simulation via Tree Expansion

Abstract: Paratemporal methods based on tree expansion have proven to be effective in efficiently generating the trajectories of stochastic systems. However, the combinatorial explosion of branching arising from multiple choice points presents a major hurdle that must be overcome to implement such techniques. In this paper, we tackle this scalability problem by developing a systems theory-based framework covering both conventional and proposed tree expansion algorithms for speeding up discrete event system stochastic simulations while preserving the desired accuracy. An example is discussed to illustrate the tree expansion framework in which a discrete event system specification (DEVS) Markov stochastic model takes the form of a tree isomorphic to a free monoid over the branching alphabet. We derive the computation times for baseline, non-merging, and merging tree expansion algorithms to compute the distribution of output values at any given depth. The results show the remarkable reduction from exponential to polynomial dependence on depth effectuated by node merging. We relate these results to the similarly reduced computation time of binomial coefficients underlying Pascal's triangle. Finally, we discuss the application of tree expansion to estimating temporal distributions in stochastic simulations involving serial and parallel compositions, with potential real-world use cases.


Introduction
Stochastic simulations require large amounts of time to generate enough trajectories to attain statistical significance and estimate desired performance indices with satisfactory accuracy [1][2][3]. Complex problems such as climate change mitigation, network design, and command and control decision support involve search spaces with deep uncertainty arising from inadequate or incomplete information about the system and the outcomes of interest [4][5][6][7][8][9][10]. Furthermore, simulation models for systems engineering analyses present challenges to today's computational technologies. First, questions addressed at the Systems of Systems (SoS) level require large, detailed models to provide sufficient representation of the relevant system-to-system interactions of a stochastic nature. Second, they also require multiple executions with multiple random seed initializations to cover the wide range of configurations necessary to obtain statistically significant measurements of performance outcome distributions.
Surrogate models, i.e., simplified models that drastically reduce computation while providing useful guidance, have successfully helped find global optima of computationally expensive optimization problems in real-world applications [11][12][13][14][15][16]. The methods using such models are often referred to as multifidelity, multilevel, or variable-fidelity optimization. We note that the term fidelity is often employed to refer to ambiguous combinations of resolution and accuracy [17,18]. A generic framework was defined [19] in which models of different accuracy and computational cost are selected algorithmically to reduce the overall computational cost while preserving the accuracy of the simulation analysis. While such frameworks exist, they do not by themselves provide the surrogate models or, more generally, methods for speeding up the simulation of stochastic systems to support more timely systems analysis and optimization [20][21][22][23][24][25].
The parallel execution of simulations offers another avenue for the speedup of complex simulations. Unfortunately, exploiting the parallelization of simulation models for generic systems engineering analyses presents challenges to today's computational technologies [26,27]. Although technological advances at the hardware level will enable more and faster processors to handle such simulations, with cloud services adding access to additional resources, such computational support is destined to reach its limit. Therefore, the imperative remains to formulate parallelization in more model-centric ways. Paratemporal and cloning simulation techniques have been introduced that increase opportunities for parallelism [28,29]. They also exploit abstractions that recognize the effect of random draws on system evolution as constituting choice points with opportunities for reuse [30]. However, scalability, the ability to overcome the combinatorial explosion of branching arising from multiple choice points, presents a major hurdle that must be overcome to implement such techniques.
Nutaro et al. [31] examined the use of tree expansion methods in lieu of the conventional sampling of outcomes when working with the uncertainty inherent in stochastic simulations. Conventional techniques simulate a large number of randomly sampled trajectories from start to finish in one-at-a-time fashion. As illustrated on the left side of Figure 1, such trajectories can be viewed as paths from an initial state of the stochastic system (the root of the tree) to one of the terminal states (leaves of the tree) in a manner consistent with random sampling. The advantage of tree construction is that states of the model that have been reached at any point (nodes of the tree) can be cloned for reuse, thus avoiding duplication. Branches in the tree from a state correspond to draws of the random variable whose values determine the subsequent course of the model from that state.
In particular, tree expansion with breadth-first traversal can significantly speed up the computation required to generate the same sampling outcomes as the one-at-a-time technique [31]. However, the speedup is limited by the exponential growth of the tree with increasing depth. Zeigler et al. [32] introduced merging of states based on homomorphism concepts to mitigate such growth. As illustrated on the right-hand side of Figure 1, they showed that such merging can reduce tree growth from exponential to polynomial in depth, thus significantly speeding up computation over that possible with cloning alone. Parallelizations of such simulations have been developed in which simulations are run until a stochastic decision point is reached. At this point, the current simulation states and branching probabilities are saved. Simulations are then spawned for the possible branchings until successive downstream branching points are reached, and the process is repeated until a satisfactory level of confidence in the outcome distribution has been attained. Besides being efficient in exploration, this "paratemporal" approach is highly parallelizable for great efficiency in execution on multiple processors.
However, although paratemporal simulations with a small number of branchings and state saves have been demonstrated to be effective, simple implementations of such solutions do not scale, as the number of branches increases rapidly for the large SoS of current interest.
In this paper, we tackle the scalability problem by first developing a formal framework covering conventional and proposed tree expansion algorithms for speeding up stochastic simulations while preserving the desired accuracy. Based on the theory of modeling and simulation, we review the definition of a discrete event stochastic model as an instance of a timed non-deterministic model. Then, we show how a reduced deterministic model with random inputs can be derived from such a stochastic model, representing the results of cloning state and transition information at branching points. The reduced model is shown to be a homomorphic image of the original, based on a correspondence restricted to non-deterministic states, with multi-step deterministic sequences mapped into corresponding single-step sequences. An example is discussed to illustrate the tree expansion framework in which the stochastic model takes the form of a binary tree isomorphic to the free monoid {0,1}*. At each node, branching occurs with equal probability to nodes at the next level. A computation time of 1 unit is taken to transition from a node to its successor. The output at a state is the number of 1's in its label. We derive the computation times for baseline, non-merging, and merging tree expansion algorithms to compute the distribution of output values at any given depth. The results show the remarkable reduction from exponential to polynomial dependence on depth effectuated by node merging. We relate these results to the reduced computation of binomial coefficients underlying Pascal's triangle and discuss the application of tree expansion to estimating temporal distributions in stochastic simulations involving serial and parallel compositions. Finally, we mention use cases estimating times to completion for complex processes and potential real-world applications.

Formal Framework for Tree Expansion for Stochastic Simulation
We employ the theory of modeling and simulation [33], based on systems theory [34], to develop a formal framework based on Discrete Event Systems Specification (DEVS) for framing the representations needed for paratemporal simulations.As in Figure 2, to capture the effect of cloning on the source stochastic simulation, we derive other representations including concepts of non-deterministic models and semigroup monoid algebras.

Definitions
We review some definitions to proceed.

Definition 1. A timed non-deterministic model is defined by M = <S, δ, ta>, where δ ⊆ S × S is the non-deterministic transition relation and ta: δ → R⁺₀,∞ is the time advance function.
We say that M is:
• Not defined at a state if there is no transition pair with the state as its left member;
• Non-deterministic at a state if the state is the left member of two or more transition pairs;
• Deterministic at a state if there is exactly one outbound transition (the state is the left member of exactly one transition pair).
Remark: Formulating a transition system in relational form, as in Definition 1, allows us to include both stochastic and deterministic discrete event systems within a common framework, as follows: a stochastic model is a timed non-deterministic model that is defined in all of its states; a deterministic model is a timed non-deterministic model (Figure 3) that is deterministic in all of its states.
Clearly, deterministic models are a subset of stochastic models.
In application to paratemporal simulation, a non-deterministic state is known as a random draw state.We make this identification in a later section after introducing DEVS Markov models.
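The classification in Definition 1 can be made concrete with a minimal sketch, assuming the transition relation δ is represented as a set of (s, s′) pairs; the function names `classify` and `is_stochastic` are our own, not part of the original formalism.

```python
# A sketch of Definition 1's classification, assuming the transition relation
# delta is represented as a set of (s, s_next) pairs; the function names are
# illustrative, not part of the original formalism.
def classify(state, delta):
    out_degree = sum(1 for (s, _) in delta if s == state)
    if out_degree == 0:
        return "undefined"        # no transition pair with state as left member
    if out_degree == 1:
        return "deterministic"    # exactly one outbound transition
    return "non-deterministic"    # left member of two or more transition pairs

def is_stochastic(states, delta):
    # A stochastic model is a timed non-deterministic model defined in all states.
    return all(classify(s, delta) != "undefined" for s in states)

delta = {("a", "b"), ("a", "c"), ("b", "c")}
print(classify("a", delta))  # → non-deterministic
```

Here "a" is a random draw state, "b" is deterministic, and "c" (with no outbound transition) makes the model fall short of the stochastic requirement that every state be defined.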

Definition 2. A state trajectory connecting a pair of states, s and s′, is a sequence s1, s2, …, sn which starts with s and ends with s′ and satisfies the transition relation, i.e., s1 = s, sn = s′, and δ(si, si+1) for i = 1, …, n − 1.

Definition 3. A deterministic state trajectory is a state trajectory containing only deterministic states. The time to traverse a deterministic state trajectory is the sum of the transition times associated with the successive pairs of states in its sequence.
We can remove deterministic states from a stochastic model and replace multi-step deterministic trajectories with single-step trajectories to represent the effect of cloning simulations. Given a stochastic model M = <S, δ, ta>, we define a reduced version that contracts deterministic sequences into single-step transitions. We can prove the following.
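As a sketch of this reduction (assuming δ and ta are stored together as an adjacency map from each state to its list of timed successors, and that deterministic chains contain no cycles; the name `reduce_model` is our own), the contraction can be written as:

```python
# Contraction of deterministic sequences (a sketch): delta maps each state to
# a list of (next_state, time) pairs, combining the transition relation and
# the time advance function. We assume deterministic chains contain no cycles.
def reduce_model(delta):
    def is_det(s):
        return len(delta.get(s, [])) == 1  # exactly one outbound transition

    reduced = {}
    for s, successors in delta.items():
        if is_det(s):
            continue  # deterministic states are dropped from the reduced model
        contracted = []
        for (t, total) in successors:
            # Follow the deterministic chain, summing transition times, until
            # a non-deterministic (or undefined) state is reached.
            while is_det(t):
                t, step = delta[t][0]
                total += step
            contracted.append((t, total))
        reduced[s] = contracted
    return reduced

delta = {"a": [("b", 1), ("c", 2)], "b": [("d", 3)], "d": [("e", 1)],
         "c": [("e", 1), ("d", 1)]}
print(reduce_model(delta)["a"])  # → [('e', 5), ('c', 2)]
```

The multi-step chain a → b → d → e of total time 1 + 3 + 1 = 5 is replaced by a single transition of time 5, illustrating how traversal times between non-deterministic states are preserved.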

Assertion 1
The reduced model is a homomorphic image of the original based on a correspondence restricted to non-deterministic states and multi-step deterministic sequences mapped into corresponding single-step sequences (Figure 4).
We note that the traversal time from any non-deterministic state to any other is preserved in the reduced version. However, the advantage of constructing this representation is that the computation (in simulation) of a multi-step sequence can be replaced by a table lookup (cloning) when the branching is encountered subsequently.
Definition 4. An input-free DEVS Markov model core is a structure MDEVS = <Y, S, Gint, Pint, λ, ta>, where Y, S, λ, and ta have the usual definitions [35].
Here, Gint: S → 2^(2^S) is a function that assigns a collection of sets Gint(s) ⊆ 2^S to every state s.
The probability that the internal transition carries a state s to a set G ∈ Gint(s) is given by a function Pint(s, G).
Definition 5. For S finite, we let Pint(s, {s′}) = Pr(s, s′), where Pr(s, s′) is the probability of transitioning from s to s′.
As defined in [33], the key to formalizing a semi-Markov model in DEVS is the definition of two structures. One corresponds to the usual matrix of probabilities for state transitions. The other assigns to each transition pair a probability density distribution over time. The choice of the next phase in the DEVS is made first by sampling the first matrix. Then, the transition from the current phase to the just-selected next phase is given a time of transition by sampling the distribution associated with that transition. More formally, this is formulated as follows:
Definition 6. A pair of structures, a probability transition structure, PTS = <S, Pr>, and a time transition structure, TTS = <S, τ>, gives rise to an input-free DEVS Markov model [33], MDEVS = <Y, SDEVS, δint, λ, ta>, where SDEVS = S × [0, 1]^S × [0, 1]^S, with typical element (s, γ1, γ2), where γi: S → [0, 1], i = 1, 2; δint: SDEVS → SDEVS is given by δint(s, γ1, γ2) = (s′, γ1′, γ2′), where s′ = SelectPhaseGint(s, γ1); ta: SDEVS → R⁺₀,∞ is given by ta(s, γ1, γ2) = SelectSigmaTTS(s, s′, γ2); and γi′ = Γ(γi), i = 1, 2.
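The two-stage sampling of Definition 6 can be illustrated by a minimal sketch, assuming the probability transition structure PTS is a nested map {s: {s′: prob}} and the time transition structure TTS maps each transition pair to a zero-argument time sampler; Python's generator state plays the role of the γ seeds, and the name `devs_markov_step` is our own.

```python
import random

# A minimal sketch of Definition 6: first SelectPhase samples the next phase
# from the probability transition structure, then SelectSigma samples a time
# advance from the distribution attached to the chosen transition. Advancing
# the rng corresponds to the seed update gamma' = Gamma(gamma).
def devs_markov_step(s, pts, tts, rng):
    # SelectPhase: sample the next phase from the probability transition structure
    phases = list(pts[s])
    s_next = rng.choices(phases, weights=[pts[s][p] for p in phases])[0]
    # SelectSigma: sample the time advance for the chosen transition
    sigma = tts[(s, s_next)]()
    return s_next, sigma

rng = random.Random(42)  # the generator state plays the role of the gamma seeds
pts = {"Start": {"Succeed": 0.8, "Fail": 0.2}}
tts = {("Start", "Succeed"): lambda: 1.0, ("Start", "Fail"): lambda: 2.0}
```

The constant-time samplers above are illustrative; in a semi-Markov model each would draw from the density assigned to that transition pair.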
The input-free DEVS Markov model is introduced as a concrete implementation for non-deterministic models.On the one hand, such models are constructible in computational form in such environments as MS4 Me [36].On the other hand, we can explicitly define how such models give rise to non-deterministic models as in the following:
Essentially, this assertion shows how a transition from state s1 to s2 is possible if there is a random selection of s2 from the set of possible next states of s1 (as determined by the seed γ1), and the time for such a transition is given by a sampling from the distribution for traversal times determined by the seed γ2.

Illustrative Example
To create the models needed to illustrate the framework of Figure 1, consider a binary tree of depth 3 with root labelled by the empty string and each node labelled by a string of 0's and 1's corresponding to the path to it from the root. At each node, a choice is made to select the successor to which to transition with equal probability. A computation time of 1 unit is taken to transition from a node to its successor. The output at a state is the number of 1's in its label. We write the formal representation as a DEVS Markov model as follows: MDEVS = <Y, SDEVS, δint, λ, ta>, with Y = {0, 1, 2, 3} and S = {0, 1, 00, 01, 10, 11, 000, 001, …, 111} (nodes in a binary tree of depth 3 labelled by strings corresponding to the paths accessing them). SDEVS is the set of pairs of the form (s, γ), where s is a member of S (a node) and γ is a state of an ideal random number generator, such that Γ(γ) is the next state of the generator (for simplicity, the time advance is constant, so an additional random variable is not needed).
Gint(s) = {s0, s1}, the subset of nodes that are immediate successors of node s in the binary tree, where SelectPhaseGint(s, γ) uses the random number state γ to select the successor node from Gint(s) with equal probability.

ta(s,γ) = 1
λ(s) is the number of 1's in the label of s.
The desired outcome of the simulation is the average of the values of the nodes.
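The baseline algorithm for this example, simulating every trajectory one at a time and averaging the outputs, can be sketched as follows (the function name is illustrative):

```python
from itertools import product

# Baseline algorithm for the binary-tree example (a sketch): generate all
# trajectories one at a time, read off lambda(s) = number of 1's in the leaf
# label, and average the results over all 2**depth trajectories.
def baseline_average(depth):
    total, count = 0, 0
    for bits in product("01", repeat=depth):  # one trajectory per path label
        label = "".join(bits)
        total += label.count("1")  # output lambda(s): number of 1's in the label
        count += 1
    return total / count

print(baseline_average(3))  # → 1.5, i.e., n/2 for depth n = 3
```

Each trajectory takes n unit-time steps and there are 2ⁿ of them, which is the n·2ⁿ baseline cost derived below.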
To illustrate the applicability of homomorphism and the minimal realization concept, we will show how it provides insight into the merging of states for tree expansion.
We will compare the following three algorithms:
1. The baseline algorithm generates all trajectories one at a time, accumulates the number of 1's for each, and averages the results.
2. The tree expansion algorithm generates all nodes in breadth-first traversal without repetition, obtaining the same information and performing the same average.
3. The node merging tree expansion algorithm additionally merges equivalent nodes at each level as the tree expands (Algorithm 1).
To analyze this example, we start with the semigroup monoid system (as in Appendix A) unfolded to depth = 4, as shown in Figure 5. The leaves of the tree are labelled with the final states listed above, and traversing the tree from root to leaves reveals 16 branching routes. We proceed to obtain the various computation times generalized to the case of a tree of arbitrary depth n, as presented in Table 1. The computation time for the baseline algorithm is the number of branching routes (2ⁿ) times the depth (n), taking unit computation time per step. The smallest number of computations when reusing earlier nodes is obtained by expanding the tree starting at the root and expanding new leaves successively at each level in breadth-first traversal. This requires 2(2ⁿ − 1) computations, as shown in the table. Therefore, the reduction in computation time is n2ⁿ/2(2ⁿ − 1), which is of order O(n). The minimal realization of this model, shown in Figure 6 to depth = 3, recognizes that all nodes labelled (x1, x2, x3) with the same sum x1 + x2 + x3 can be grouped into the same congruence class. The justification of congruence is easy to see in this case: the 0 input keeps such an element in the same class, while the 1 input transitions it to the class whose sum is increased by 1. With the merging of nodes, the tree expands in a manner emulating Pascal's triangle for computing binomial coefficients [37]. At any depth n, there are a total of 2ⁿ subsets of sizes ranging from 0 to n, with cardinalities given by the binomial coefficients. For example, at depth 3, there are 1, 3, 3, 1 subsets of sizes 0, 1, 2, 3, respectively. With the size of a subset representing the number of 1's shared by the strings in the same equivalence class, we see that the average number of 1's equals n/2, as expected. Importantly, the number of nodes, and hence the computation time, grows only as the square of n rather than exponentially, as shown in Table 1. The computation time is now O(n²) in the depth of the tree, and the reduction is of exponential order, which would be exceedingly impressive to achieve in real-world applications.
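The merging just described can be sketched by carrying, per level, only one representative per equivalence class together with its class size, exactly as Pascal's triangle accumulates binomial coefficients (a minimal sketch; the function names are our own):

```python
# Merged tree expansion (a sketch): nodes whose labels contain the same number
# of 1's are congruent, so each level keeps one representative per class plus
# the class size. The class sizes at depth n are the binomial coefficients.
def merged_expansion(depth):
    level = {0: 1}  # {number of 1's: class size}; the root label has no 1's
    for _ in range(depth):
        nxt = {}
        for ones, size in level.items():
            nxt[ones] = nxt.get(ones, 0) + size          # branch on 0: same class
            nxt[ones + 1] = nxt.get(ones + 1, 0) + size  # branch on 1: next class
        level = nxt
    return level

def merged_average(depth):
    level = merged_expansion(depth)
    return sum(k * size for k, size in level.items()) / 2 ** depth

print(merged_expansion(3))  # → {0: 1, 1: 3, 2: 3, 3: 1}, a row of Pascal's triangle
```

Each level holds only n + 1 representatives, so computing the full distribution at depth n takes O(n²) work rather than O(2ⁿ), in line with Table 1.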

Empirical Confirmation of Theory Predictions
To test the theory and its implementation, the illustrative example was implemented in Java and executed to compare computation times with those predicted from Table 1.
Figure 7 shows that the relative computation times of the merged, unmerged, and baseline algorithms fall in the order predicted in Table 1. However, Figure 8 shows that the actual speedup realized by merging relative to the baseline is approximately 50 times less than predicted. Nevertheless, the speedup achieved at depth 22, approximately 7000, is still highly significant. Note that the baseline algorithm exceeds the available memory at depth 23 due to exponential node growth. Trees up to depth 1000, with 10 replications each, were tested to obtain the timing results, and all yielded the correct outcome distribution.

Tree Expansion with Merging Applied to Serial and Parallel Compositions
Serial and parallel compositions of simple DEVS Markov models illustrate how merging in tree expansion can work in a large class of stochastic systems to greatly control node generation and computation time. To succeed, a serial composition requires the success of each of its components; likewise, it fails if any one of its components fails [38]. In contrast, the success of a parallel composition requires that only one of its components succeeds, while its failure entails the failure of all the components.
The component models in the compositions are of the form of a DEVS Markov model (not including input and output), as shown in Figures 9 and 10.
The transition structure of the model is specified by two parameters: PSucceed, the probability of success, and succeed, the probability distribution for the time required to achieve success. These populate the values of the probability and time transition structures needed for the definition of the model, which is given as follows: MDEVS = <Y, SDEVS, δint, λ, ta>, with Y = {Succeed, Fail} and S = {Start, Succeed, Fail}. SDEVS is the set of triples of the form (s, γ1, γ2), where s is a member of S (a node) and γ1, γ2 are states of ideal random number generators, one for selecting between the transitions from Start to Succeed or Fail and one for selecting the time distribution for the transition to Succeed, if selected. The time for the transition to Fail is not of interest here, so no random seed is associated with it.
Gint(Start) = {Succeed, Fail}, the subset of nodes that are immediate successors of node Start, and s′ = δint(Start, γ1, γ2) = (SelectPhaseGint(Start, γ1), Γ(γ1), γ2), where the latter uses the random number seed γ1 to select Succeed with probability PSucceed and Fail with probability 1 − PSucceed. ta(Succeed, γ1, γ2) = SelectSigmaTTS(Start, Succeed, Γ(γ2)), which selects the time required to succeed from the given distribution, succeed, and λ outputs the time selected for the time advance.
Serial and parallel compositions in Figures 9 and 10 are defined using the standard DEVS coupled model specifications [33]. As illustrated in Figure 11, the temporal behaviors of the compositions are derived directly from those of the components and the coupling specified by the respective composition. In the serial composition, the Activate input is shown setting the first (top) model, M, into state Succeed, with an output after time ta. The latter is determined by the random sampling process just described and shown as a threshold value for the elapsed time, e, to achieve. This output, Y, is coupled to the input of the second (bottom) model, M′, and causes it to output a final Success after ta′, for a total response time of ta + ta′.
In the case of the parallel composition, the Activate input starts both models simultaneously and results in a Success output from the quickest model at time min{ta,ta'}.
These paired configurations are easily generalized to any finite number of components: for the serial composition, components are placed in a sequence with the Success output port of one connected to the Activate input port of the next; for the parallel composition, all components receive activation simultaneously, with all Success output ports coupled to the overall Success output port. The desired outcome of simulating the compositions is the probability distribution of the time needed for success. In the serial case, a sampled value of this outcome is the sum of the time samples from each component, since all must succeed for the whole to succeed. In the parallel case, a sampled outcome is the minimum of the sampled durations of the components, since overall success is achieved by the first to succeed. Analytic solutions such as those by Rice [38] are possible given analytic input distributions, but stochastic simulation is needed otherwise.
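A sampled outcome of either composition can be sketched as follows, assuming each component is represented by a time sampler and the probability of failure is zero; the function names and the uniform distributions are illustrative only.

```python
import random

# Sampled outcome times for the compositions (a sketch, zero failure
# probability): the serial outcome is the sum of the component success times,
# and the parallel outcome is the minimum, as described in the text.
def sample_serial(samplers, rng):
    return sum(draw(rng) for draw in samplers)  # all components must succeed

def sample_parallel(samplers, rng):
    return min(draw(rng) for draw in samplers)  # first success completes it

rng = random.Random(7)
components = [lambda r: r.uniform(1, 2), lambda r: r.uniform(0, 3)]
serial_t = sample_serial(components, rng)
parallel_t = sample_parallel(components, rng)
```

Repeating such draws many times is exactly the one-at-a-time baseline whose cost the granule-based merging of the next section avoids.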

Computation Time Required for Serial and Parallel Compositions
In application to serial and parallel compositions, the number of components determines the depth of the tree, n. The temporal distribution of each component is assumed to be known and represented by a discrete probability density over an interval divided into G segments, where G is called the number of granules and determines the accuracy of the computed outcome. In the following, we simplify the discussion by considering only the case where the probability of failure is zero.
Tree expansions for the serial and parallel compositions in Figure 12 evolve with G branches stemming from each node. For the serial composition, with the first model at level 1, each of the G branches from the root is labelled by the time advance corresponding to the granule represented by that branch. For each node generated by such branching, the second model's response is represented by G branches at level 2, labelled by the corresponding time advances. Response times, t = ta + ta′, for the composition are accumulated in the nodes terminating the paths labelled by the pairs of branches. In the case of parallel composition, the only difference is that the minimum operation is applied to the pair of time advances stored in the node terminating the corresponding path. Therefore, in both serial and parallel compositions, the unmerged tree expands to Gⁿ nodes at depth n. However, as the tree front expands, applying the merging process to nodes at the same level greatly reduces the number of nodes that must be considered as the depth increases. Merging in the case of serial composition adds only G nodes at each level, thus reducing the growth in nodes to nG. This is so because at each successive level the range of the sum ta + ta′ is bounded above by max{ta} + max{ta′}. Thus, after merging, only G new nodes are added at each level. Moreover, at each level, the number of operations is at most nG², since the nG nodes are combined with the G granules to create the next level. Thus, the incremental computation time is nG², and the total computation time should grow as O((nG)²), i.e., as the square of the depth and the number of granules.
Merging in the case of parallel composition does not suffer any tree expansion, since the combined range of a minimization is contained in the original range, i.e., the range of min{ta, ta′} is bounded above by min{max{ta}, max{ta′}}. Thus, by the reasoning above, the number of nodes grows as nG and the operations grow as nG². So, the computation time required for depth n should grow as O(nG²), i.e., linearly with the depth and as the square of the number of granules.
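The merged granule computation can be sketched by representing each component's temporal distribution as a map from granule times to probabilities; merging happens automatically because equal sums (or minima) share a dictionary key (a sketch under the zero-failure assumption; the function names are our own).

```python
# Granule-based merged expansion (a sketch, zero failure probability): each
# component's temporal distribution is a map {granule time: probability}.
# Equal sums (serial) or equal minima (parallel) share one dictionary key,
# so nodes are merged automatically as each level is generated.
def serial_merge(dists):
    out = {0.0: 1.0}
    for d in dists:  # one level per component; sums stay within a bounded range
        nxt = {}
        for t1, p1 in out.items():
            for t2, p2 in d.items():
                nxt[t1 + t2] = nxt.get(t1 + t2, 0.0) + p1 * p2
        out = nxt
    return out

def parallel_merge(dists):
    out = dists[0]
    for d in dists[1:]:  # the min keeps the result within the original G bins
        nxt = {}
        for t1, p1 in out.items():
            for t2, p2 in d.items():
                t = min(t1, t2)
                nxt[t] = nxt.get(t, 0.0) + p1 * p2
        out = nxt
    return out

d = {1: 0.5, 2: 0.5}  # two granules, G = 2
print(serial_merge([d, d]))    # → {2.0: 0.25, 3.0: 0.5, 4.0: 0.25}
print(parallel_merge([d, d]))  # → {1: 0.75, 2: 0.25}
```

The serial case keeps at most a linearly growing set of distinct sums per level (the merged nG nodes), while the parallel case never grows beyond the original G bins, matching the O((nG)²) and O(nG²) estimates above.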

Related Work
DEVS has served as a basis for the formalization [38,39] and study of multiresolution constructions underlying multifidelity simulations [40][41][42][43][44][45]. DEVS has also been employed as a basis for the simulation of Markov Decision Process (MDP) models, employing its modular and hierarchical aspects to improve the explainability of the models, with application to optimization processes in financial, industrial, and other domains [46][47][48][49][50][51][52]. Capocchi, Santucci, and Zeigler [53] introduced a DEVS-based framework to construct and aggregate Markov chains using a relaxed form of lumpability to enhance the understanding of complex Markov search spaces. However, that methodology is limited to selecting optimal partitions according to a metric that compares Markov chains based on their respective steady states; its practical application is limited since these are not generally available in a problem specification. In contrast, paratemporal and cloning simulation techniques are intended for application to stochastic simulation in general and offer opportunities for parallelism and cloning of state information. However, they have not demonstrated the ability to overcome the combinatorial explosion of branching arising from multiple choice points; the absence of such scalability presents a major hurdle that must be overcome to implement such techniques. Nutaro et al. [31] showed that the speedup of tree expansion methods is limited by the exponential growth of the tree with increasing depth. Zeigler et al. [54] introduced the merging of states based on homomorphism concepts to mitigate such growth. Here, we demonstrated that homomorphic merging of states can be formally characterized using DEVS Markov modeling and simulation theory, and we showed examples where such merging achieves a reduction from exponential to polynomial computational effort.

Conclusions and Further Work
We have developed a formal framework covering conventional and proposed tree expansion algorithms for speeding up stochastic simulations while preserving the desired accuracy. Based on the theory of modeling and simulation, we showed how a reduced deterministic model with random inputs can be derived from a stochastic model. This reduced model represents the results of cloning state and transition information at branching points. The reduced model was shown to be a homomorphic image of the original, based on a correspondence restricted to non-deterministic states, with multi-step deterministic sequences mapped into corresponding single-step sequences. An example was discussed to illustrate the tree expansion framework in which the stochastic model takes the form of a binary tree, allowing us to derive the computation times for baseline, non-merging, and merging tree expansion algorithms to compute the distribution of output values at any given depth. The results show the remarkable reduction from exponential to polynomial dependence on depth effectuated by node merging. We related these results to the reduced computation of binomial coefficients underlying Pascal's triangle.
Applications of node merging tree expansion algorithms are currently being studied in simulations of space-based threat responses to estimate the probability of successfully identifying, tracking, and targeting hypersonic missiles within tight deadlines [38], as well as to attrition modeling employing stochastic interactions between opposing forces [55][56][57][58][59][60].In such models, temporal duration outcomes in the form of probabilistic temporal distributions play a major role, and homomorphic merged tree expansion enables much faster computation of outcome distributions when analytic solutions are lacking.

Figure 1. Tree expansion generation of state trajectories (left) and the effect of node merging on tree growth (right).

Figure 2. DEVS-based framework for framing the representations needed for paratemporal simulations.

Algorithm 1. Node merging tree expansion algorithm.
The node merging algorithm based on the minimal realization modifies tree expansion by maintaining only representatives of the equivalence classes as successive levels are generated, while tracking the size of each class as it develops. The values of the classes, weighted by their respective sizes, are summed to obtain the desired average. Algorithm 1 is sketched as follows:
A node n contains data (s, num), where s is a string in {0,1}* and num is the number of nodes equivalent to n.
Root node = (ε, 1) // ε is the empty string
Initiation: depth = 0; newLeaves = {}; oldLeaves = {(ε, 1)}
Termination: depth = D
Output: for each i = 0, 1, …, D, the number of strings whose number of ones equals i.
While depth < D {
  For each node n = (s, num) in oldLeaves {
    For each branch b in {0, 1} {
      Create child c = (sb, num) // extend the parent's string and inherit the parent's number of represented equivalents
      If c isEquivalentTo some node m = (t, num′) in newLeaves, then set m = (t, num′ + num)
      Else add c to newLeaves
    }
  }
  depth = depth + 1
  oldLeaves = newLeaves; newLeaves = {}
}
where n = (s, num) isEquivalentTo m = (t, num′) if, and only if, s and t have the same number of ones.
// Note that only the leaves at each level (including the final depth) are kept as the expansion advances, as required by the output.

Figure 6. Minimal realization of the example tree.

Figure 7. Measured and predicted computation times, respectively (in seconds).

Figure 8. Measured speedup of baseline relative to merged tree expansion vs. scaled predicted speedup.

Figure 9. Serial and parallel compositions of Markov DEVS Success/Fail models.

Figure 10. Probability and time transition structures of the DEVS Markov model.

Figure 11. Temporal behaviors of serial and parallel compositions.

Figure 12. Illustrating tree expansions for serial and parallel compositions.

Table 1. Analysis of the illustrative example.