Categorification of the Müller-Wichards System Performance Estimation Model: Model Symmetries, Invariants, and Closed Forms

The Müller-Wichards model (MW) is an algebraic method that quantitatively estimates the performance of sequential and/or parallel computer applications. Because of category theory’s expressive power and mathematical precision, a category theoretic reformulation of MW, i.e., CMW, is presented in this paper. The CMW is effectively numerically equivalent to MW and can be used to estimate the performance of any system that can be represented as numerical sequences of arithmetic, data movement, and delay processes. The CMW fundamental symmetry group is introduced and CMW’s category theoretic formalism is used to facilitate the identification of associated model invariants. The formalism also yields a natural approach to dividing systems into subsystems in a manner that preserves performance. Closed form models are developed and studied statistically, and special case closed form models are used to abstractly quantify the effect of parallelization upon processing time vs. loading, as well as to establish a system performance stationary action principle.


Introduction
In the late 1980s, D. Müller-Wichards proposed a concise novel approach for estimating the total performance of computer-based (but machine-independent) applications by algebraically combining the known performance estimates of the individual arithmetic, data movement, and delay elements that comprise the applications, in a manner that also accounts for the various degrees of parallelism that can occur during processing [1]. The essence of the mathematical framework upon which the Müller-Wichards model (MW) is based is abstracted by the expression ϕ : H → P, where H is a monoid (i.e., a semigroup with identity) representation of the application, P is the Müller-Wichards performance algebra (also a monoid), and ϕ is a monoid homomorphism. The image ϕ(h) in P of a single application element h in H provides a numerical performance measure for h. The total numerical performance estimate for the application results naturally from the associative binary operations in H and P and the homomorphism property of ϕ, which algebraically combines the performance estimates for each application element in H.
Every area of mathematics (e.g., group theory, topology) is described by numerous definitions, theorems, and constructions. However, many common mathematical concepts occur naturally with only slight variation in the various areas of mathematics. Category theory is that branch of mathematics which identifies and studies these common concepts and provides formal mechanisms for mapping them from one area of mathematics to another. More specifically, a category (e.g., the category of sets) consists of a class of objects (e.g., sets), morphisms between objects (e.g., maps between sets), an identity morphism for each object (e.g., the set identity map), and a rule for associatively composing morphisms (e.g., composition of maps). Functors provide formal maps between categories (e.g., from the category of groups to the category of sets) by associating objects and morphisms in different categories subject to the constraints that morphism composition and object identities are preserved.
Because of its generality, category theory has found application in recent years in such diverse areas as physics (e.g., [2][3][4][5]), design specification (e.g., [6,7]), data fusion (e.g., [8]), computer science (e.g., [9]), computer security (e.g., [10,11]), systems engineering (e.g., [12]), manufacturing (e.g., [13]), theoretical biology (e.g., [14][15][16]), network theory (e.g., [17]), multi-agent systems models (e.g., [18]), concurrent system design verification (e.g., [19]), emergence (e.g., [20]), and artificial general intelligence (e.g., [21]). Motivated by category theory's mathematical precision and expressive power, this paper introduces a new category theoretic application useful for numerical systems engineering modelling and analysis via a straightforward categorification of MW for the case that the application monoid H is a free monoid generated by a finite set of basis processes. That is, H is the set of all finite sequences of basis processes, including the empty sequence, where each sequence represents a system, each basis process corresponds to either an arithmetic process, a data movement process, or a delay process, and catenation of systems serves as the associative binary operation. The categorified MW, denoted by CMW, comprises three components: a single object category of systems that is specified by H and has the elements of H as its set of morphisms; a single object performance category that is specified by P and has the elements of P as its set of morphisms; and a performance functor specified by ϕ with the category of systems as its domain and the performance category as its codomain.
CMW is effectively numerically equivalent to MW, with the added benefit that the formalism introduced by categorification provides a precise vocabulary useful for identifying and discussing certain fundamental properties of both the model and the systems modelled by it. In particular, the CMW fundamental symmetry group is introduced and the category theoretic formalism is used to facilitate the identification of associated symmetry invariants found in the model. In addition, the formalism provides a natural approach to system factorization, i.e., dividing a system into subsystems in a manner that maintains its integrity (that is, preserves the performance of the undivided system).
As indicated above, the objective of this paper is to provide a categorified version of the MW model and exploit the associated category theoretic vocabulary to provide additional insights into properties of the MW model, as well as the systems modelled by it. In order to make this paper reasonably self-contained, relevant definitions, terminology, and preliminary lemmas are summarized in the next section (for additional depth and clarification the reader is invited to consult such references as [22][23][24]). The remainder of this paper is organized as follows: The Müller-Wichards performance categories are defined, and their category theoretic properties discussed, in Section 3. The category of systems and the Müller-Wichards performance functors are introduced in Section 4. The CMW fundamental symmetry group is defined and its associated model invariants are identified in Section 5. System factorization and a "product of categories" performance model are discussed in Section 6. Closed form models are developed in Section 7 and several aspects of closed form models are studied statistically in Section 8. Special case closed form models are developed in Section 9. These special case closed form models are applied in Section 10 to abstractly demonstrate the effect of parallelization on system processing time vs. loading and to obtain a stationary action principle for system performance. Concluding remarks comprise the final section of this paper. To avoid disrupting the flow of the text, proofs for all theorems (including those for the special case closed form models) are consigned to Appendix A.

Categories and Morphisms
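A category $C$ consists of a collection $\mathrm{Obj}_C$ of objects such that ([22], p. 2):

1. For every pair of objects $X, Y \in \mathrm{Obj}_C$ there is a (possibly empty) set $\mathrm{Mor}_C(X, Y)$ of morphisms from $X$ to $Y$;
2. For any $X, Y, Z \in \mathrm{Obj}_C$ there is a composition "$\circ$" of morphisms $\mathrm{Mor}_C(X, Y) \times \mathrm{Mor}_C(Y, Z) \to \mathrm{Mor}_C(X, Z)$ given by $(f, g) \mapsto g \circ f$ with the properties:
3. For every $X \in \mathrm{Obj}_C$ there is an identity morphism $1_X \in \mathrm{Mor}_C(X, X)$ such that $f \circ 1_X = f$ and $1_X \circ g = g$ whenever these compositions are defined;
4. When defined, composition of morphisms is associative, i.e., $(h \circ g) \circ f = h \circ (g \circ f)$.

To illustrate this definition, consider the following canonical examples of categories: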

• The category Set, where $\mathrm{Obj}_{\mathrm{Set}}$ is the collection of all sets, the morphisms are the ordinary mappings between sets, and $\circ$ is the usual composition of maps.
• The category Grp, where $\mathrm{Obj}_{\mathrm{Grp}}$ is the collection of all groups, the morphisms are the ordinary group homomorphisms, and $\circ$ is the usual composition of homomorphisms.

It is easily verified that Set and Grp satisfy items 3 and 4 above. A category $C$ is a subcategory of a category $D$ if ([22], p. 7): every object of $C$ is an object of $D$; for all objects $X, Y$ of $C$, $\mathrm{Mor}_C(X, Y) \subseteq \mathrm{Mor}_D(X, Y)$; the composition of two morphisms in $C$ is the same as their composition in $D$; and for all objects $X$ of $C$, $1_X$ is the same in $D$ as it is in $C$. If $\mathrm{Obj}_C$ is a set, then $C$ is a small category ([22], p. 6), and if $\mathrm{Mor}_C(X, Y) \neq \emptyset$ for all $X, Y \in \mathrm{Obj}_C$, then $C$ is a connected category ([22], p. 19).
Morphisms are classified in a variety of ways according to their composition properties. Of interest here are monic and epic morphisms. A morphism $f$ is monic if $f \circ g = f \circ h$ implies $g = h$, and it is epic if $g \circ f = h \circ f$ implies $g = h$. For example, in the category Set injective maps are monic morphisms and surjective maps are epic morphisms. A morphism that is both monic and epic is a bimorphism ([22], p. 15). A morphism $h$ is factorizable ([22], p. 43) if $h = f \circ g$, where $f$ is monic and $g$ is epic.
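Another useful notion involving morphisms is the sieve ([25], p. 206) on an object. For some $X \in \mathrm{Obj}_C$ consider the set of morphisms $\Psi_X = \{ f \mid f \in \mathrm{Mor}_C(Y, X),\ Y \in \mathrm{Obj}_C \}$, i.e., all morphisms with codomain $X$. An $X$-sieve is a subset $S \subseteq \Psi_X$ such that $f \in S$ implies $f \circ g \in S$ whenever the composition $f \circ g$ is defined. Note that there are at least two sieves for every $X \in \mathrm{Obj}_C$, namely $\Psi_X$ and the empty sieve $\emptyset$.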

Functors
Functors can be regarded as morphisms between categories and, in a sense, they provide a "picture" of what one category looks like in another. If $F$ is a covariant functor, or simply a functor (contravariant functors are not used here) ([22], p. 73), from category $C$ to category $D$ (denoted $F : C \to D$), then it assigns to every $X \in \mathrm{Obj}_C$ an $FX \in \mathrm{Obj}_D$ and to every $f \in \mathrm{Mor}_C(X, Y)$ an $Ff \in \mathrm{Mor}_D(FX, FY)$ such that:

1. $F 1_X = 1_{FX}$ for every $X \in \mathrm{Obj}_C$;
2. $F(g \circ f) = Fg \circ Ff$ whenever $g \circ f$ is defined in $C$.

Two standard examples of functors are:

• The identity functor $1_C : C \to C$, which makes the assignments $1_C X = X$ for every $X \in \mathrm{Obj}_C$ and $1_C f = f$ for every $f \in \mathrm{Mor}_C(X, Y)$.
• The forgetful functor $U : \mathrm{Grp} \to \mathrm{Set}$, which assigns to every group $G \in \mathrm{Obj}_{\mathrm{Grp}}$ its underlying set $UG \in \mathrm{Obj}_{\mathrm{Set}}$ and to each homomorphism $f \in \mathrm{Mor}_{\mathrm{Grp}}(G, H)$ the set map $Uf \in \mathrm{Mor}_{\mathrm{Set}}(UG, UH)$ (i.e., $U$ forgets group structure going from Grp to Set).

It can be determined by inspection that these functors satisfy the required properties given above by items 1 and 2.
Preservation and reflection are two important features of functors. A functor $F : C \to D$ preserves a categorical property $P$ ([22], p. 97) if, whenever an object, morphism, or diagram has property $P$ in $C$, the image under $F$ of that object, morphism, or diagram has property $P$ in $D$. Similarly, $F$ reflects property $P$ ([22], p. 97) if, whenever the image under $F$ of an object, morphism, or diagram has property $P$ in $D$, that object, morphism, or diagram has property $P$ in $C$.
It is readily deduced from item 2 that, in general, functors preserve commutative triangles of morphisms. For example, if Figure 1 is the commutative triangle for $f \circ g = h$ in $C$, then it must be the case that the commutative triangle in Figure 2 is the preserved image of Figure 1 under the functor $F$ in $D$. Note that Figure 1 is a factorization of $h$ when $f$ is monic and $g$ is epic.

Preliminary Lemmas
The following lemmas are needed to prove and discuss the main results of this paper. They have been established elsewhere and are stated here without proof for the reader's convenience.
Lemma 1 [9]. Any monoid $M$ specifies a category $M$ with $M$ as its only object, the elements of $M$ as its only set of morphisms, and the binary operation on $M$ as its composition of morphisms.
Lemma 3 [24]. Let $F_X$ be the free monoid on a set $X$ and $S$ be any monoid. If $\phi_0$ is any mapping of $X$ into $S$, then $\phi_0$ can be extended in one and only one way to a homomorphism $\phi$ of $F_X$ into $S$ as $\phi(x_1 x_2 \cdots x_n) = \phi_0(x_1)\,\phi_0(x_2) \cdots \phi_0(x_n)$.
Lemma 4 [22]. Every faithful functor reflects monics, epics, and commutative triangles.
Lemma 5 [24]. A free monoid $F_X$ is cancellative (i.e., each of $uv = uw$ and $vu = wu$ implies $v = w$ for $u, v, w \in F_X$) and equidivisible (i.e., $uv = wz$ implies there exists an $x$ such that either $u = wx$ and $z = xv$, or $w = ux$ and $v = xz$, for $u, v, w, x, z \in F_X$).
As an example of equidivisibility, suppose $uv = wz$ is a commutative square in a free monoid with $u = a_1 a_2 a_3 a_4 a_5$, $v = a_6 a_2 a_4 a_7$, $w = a_1 a_2$, and $z = a_3 a_4 a_5 a_6 a_2 a_4 a_7$. Then there exists an $x = a_3 a_4 a_5$ such that $u = wx$ and $z = xv$.
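The equidivisibility example can be checked mechanically. The following is a minimal sketch, assuming a system is encoded as a Python tuple of basis-process names so that tuple concatenation plays the role of catenation; the function name `equidivide` and the string labels are illustrative choices, not notation from the paper.

```python
from typing import Optional, Tuple

Word = Tuple[str, ...]  # a word in the free monoid: a finite sequence of generator names


def equidivide(u: Word, v: Word, w: Word, z: Word) -> Optional[Tuple[str, Word]]:
    """Return the connecting word x promised by equidivisibility, or None.

    If uv == wz, then either u = wx and z = xv, or w = ux and v = xz.
    """
    if u + v != w + z:
        return None  # uv = wz does not hold, so there is nothing to divide
    if len(u) >= len(w):
        x = u[len(w):]                       # the part of u that overhangs w
        assert u == w + x and z == x + v
        return ("u = wx, z = xv", x)
    x = w[len(u):]                           # the part of w that overhangs u
    assert w == u + x and v == x + z
    return ("w = ux, v = xz", x)


# The words from the example above.
u = ("a1", "a2", "a3", "a4", "a5")
v = ("a6", "a2", "a4", "a7")
w = ("a1", "a2")
z = ("a3", "a4", "a5", "a6", "a2", "a4", "a7")

case, x = equidivide(u, v, w, z)
print(case, x)
```

Because words are compared position by position, the connecting word is simply the overhang of the longer left factor, which is why no search is needed.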

The Müller-Wichards Performance Categories
As mentioned in the introduction, in 1988 Dieter Müller-Wichards posited a (computing) machine-independent "performance algebra" designed to estimate the total performance of a specific implementation of an application on a machine using the (known) performance characteristics of its individual building blocks. This approach enabled trade-off studies and analyses of applications using its various decompositions and implementations [1]. As indicated above, similar studies and analyses can be performed using CMW to estimate the numerical performance of any system that can be represented as sequences of arithmetic, data movement, and delay processes.
The Müller-Wichards performance algebra consists of a set whose elements are ordered triples, along with an associative binary operation that defines how the elements of the set are to be multiplied. These elements are partitioned into three distinct subsets depending upon whether they are arithmetic, data movement, or delay elements. The first and second entries in each triple correspond to a performance value (e.g., a processing or data movement rate) and a weight value (e.g., the number of operations to be performed or the amount of data to be moved), respectively (delay elements are a special case; see below). The third entry is either a 0, which defines the element as a data movement or delay element, or a 1, which defines the element as an arithmetic element. Multiplication of an arithmetic element by any element produces another arithmetic element, whereas the product of a data movement element with a data movement element or a delay element is a data movement element. The set of delay elements is closed under this multiplication.
Multiplication is defined in terms of a positive real valued parameter q, which provides a family of ("skewed" time) results that interpolate between completely sequential and completely parallel execution of the associated application.The value of this parameter can be selected to account for execution time "speedup" or for degrees of machine and application parallelism.
The sets that define the Müller-Wichards performance algebra are $\mathbb{R}^+$, the set of positive real numbers, together with the sets $C$, $D$, and $V$ of arithmetic, data movement, and delay elements, respectively. Each element in $C$, $D$, and $V$ is a triple $g = (r, w, u)$, where $r$ is a performance value, $w$ is a weight value, and $u \in \{0, 1\}$ determines the element type (i.e., $C$, $D$, or $V$).
Since $(\wp, \otimes_q)$ is the monoid $M_q$, it specifies a category according to the prescription given by the next theorem.
Theorem 2. $(\wp, \otimes_q)$ specifies the Müller-Wichards performance category $\wp_q$ with $M_q$ as its only object, $\mathrm{Mor}_{\wp_q}(M_q, M_q) = \wp$ as its only morphism set, and composition of morphisms defined by $\otimes_q$.
In order to identify additional categorical properties derived from the algebraic structure of the performance algebra, the subsets $\wp^X \subseteq \wp$, $X \in \{C, D, CD, V\}$, are considered. This leads to the following result:
Theorem 4. If $X \in \{C, D, CD, V\}$, then $(\wp^X, \otimes_q)$ specifies the Müller-Wichards performance category $\wp^X_q$ with $M^X_q$ as its only object, $\mathrm{Mor}_{\wp^X_q}(M^X_q, M^X_q) = \wp^X$ as its only morphism set, and composition of morphisms defined by $\otimes_q$.
Note that although $\mathrm{Mor}_{\wp^X_q}(M^X_q, M^X_q) \subset \mathrm{Mor}_{\wp_q}(M_q, M_q)$, $\wp^X_q$ is not a subcategory of $\wp_q$ because $M^X_q \neq M_q$.
Theorem 5. Each of the categories $\wp_q$ and $\wp^X_q$, $X \in \{C, D, CD, V\}$, is small and connected.
The following result is included for completeness and is a category theoretic statement of the fact that multiplication in the performance algebra is biased towards arithmetic elements, i.e., the product of any element in the set $\wp \equiv C \cup D \cup V$ with an element in the set $C$ is an element in $C$.
Theorem 7. The set $C$ is an $M_q$-sieve in category $\wp_q$.

The Category of Systems and the Müller-Wichards Performance Functors
In this section, the category of systems is obtained from the free monoid $I_A$ generated by a finite set $A$ of basis processes. Although this approach is similar to that employed in [1] to define $H$, for simplicity the catenation operation in $I_A$ is used here instead of the two operations $\otimes$ and $\oplus$ used in $H$ to distinguish between portions of an application that have sequential and parallel implementations. A product of categories construction will be introduced for this purpose in Section 6.
Recall from Section 1 that $I_A$ consists of all finite sequences of basis processes of $A$, called systems. The catenation of systems $u, v \in I_A$ is the system $w = uv \in I_A$, and $u$ and $v$ are subsystems of $w$. The empty system $\varepsilon \in I_A$ is the system with no basis processes and serves as the identity element for $I_A$ with $\varepsilon u = u \varepsilon = u$.
The category of systems $\mathcal{A}$ associated with a basis process set $A$ is obtained from $I_A$, as prescribed in the following theorem.
Theorem 8. $I_A$ specifies the category of systems $\mathcal{A}$ with $F_A$ as its only object, $\mathrm{Mor}_{\mathcal{A}}(F_A, F_A) = I_A$ as its only morphism set, and composition of morphisms defined by catenation of systems.
Thus, a morphism in $\mathcal{A}$ is a system and composition of morphisms in $\mathcal{A}$ is catenation of systems.
Theorem 9. $\mathcal{A}$ is a small connected category.
Theorem 10. Every morphism in $\mathcal{A}$ is a bimorphism.
Any functor from $\mathcal{A}$ into $\wp_q$ or $\wp^X_q$, $X \in \{C, D, CD, V\}$, is referred to here as a Müller-Wichards performance functor for $\mathcal{A}$, and the performance for any system in $\mathcal{A}$ is its image under a Müller-Wichards performance functor in $\wp_q$ or $\wp^X_q$, $X \in \{C, D, CD, V\}$. These functors are defined by unique extensions of gauge maps from the set of basis processes into the associated Müller-Wichards performance monoids, as described by the following theorem (here "gauge" emphasizes the fact that these maps set the performance gauge for each basis process).

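Theorem 11. For any gauge map $\phi_0 : A \to (\wp, \otimes_q)$ there is a unique Müller-Wichards performance functor $F_{\phi_0} : \mathcal{A} \to \wp_q$ such that $F_{\phi_0}[a_1 a_2 \cdots a_n] = \phi_0(a_1) \otimes_q \phi_0(a_2) \otimes_q \cdots \otimes_q \phi_0(a_n)$, where $a_1 a_2 \cdots a_n \in \mathrm{Mor}_{\mathcal{A}}(F_A, F_A)$.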
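The unique-extension mechanism behind the performance functors (Lemma 3) can be sketched in a few lines. This is a minimal illustration only: the target monoid below is the non-negative integers under addition, a stand-in chosen so that results are exact, and not the Müller-Wichards performance algebra itself. The helper name `extend` and the process names are hypothetical.

```python
from functools import reduce
from typing import Callable, Dict, Tuple, TypeVar

S = TypeVar("S")
Word = Tuple[str, ...]  # a system: finite sequence of basis-process names


def extend(gauge: Dict[str, S], op: Callable[[S, S], S], identity: S) -> Callable[[Word], S]:
    """Extend a gauge map on generators to the whole free monoid.

    The result sends the empty word to the identity and a word a1 a2 ... an
    to gauge[a1] op gauge[a2] op ... op gauge[an].
    """
    def phi(word: Word) -> S:
        return reduce(op, (gauge[a] for a in word), identity)
    return phi


# Stand-in target monoid (ASSUMPTION for illustration): integers under addition.
gauge = {"add": 3, "move": 5, "wait": 2}
phi = extend(gauge, lambda s, t: s + t, 0)

u = ("add", "move")
v = ("wait", "add")

# Functoriality: the identity is preserved and catenation maps to the monoid product.
assert phi(()) == 0
assert phi(u + v) == phi(u) + phi(v)
print(phi(u + v))
```

Once the images of the generators are fixed, associativity of the target operation leaves no freedom in the image of a longer word, which is the sense in which the extension is unique.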

The CMW Fundamental Symmetry Group and Associated Invariants
Based upon the discussion above, it is clear that CMW can be abstracted by the functor expressions $F_{\phi_0} : \mathcal{A} \to \wp_q$ and $F^X_{\phi_0} : \mathcal{A} \to \wp^X_q$, $X \in \{C, D, CD, V\}$. In this section, the CMW fundamental symmetry group is introduced and associated model invariants (which at an abstract level can be viewed as invariants of the modelled system) are identified. Recall that, in general, a symmetry associated with a "situation" is related to an "immunity to change" for some aspect of the "situation". In order for a "situation" to have a symmetry: (i) the aspect of the "situation" remains unchanged, or invariant, when a change or symmetry transformation acts upon the "situation"; and (ii) it must be possible to perform the change, although the change does not actually have to be performed [26].
Here, symmetry and symmetry transformation are used interchangeably.
An automorphism of $\mathcal{A}$ is a bijection $\alpha : \mathrm{Mor}_{\mathcal{A}}(F_A, F_A) \to \mathrm{Mor}_{\mathcal{A}}(F_A, F_A)$ which preserves the catenation operation in $I_A$. Since $\mathrm{Mor}_{\mathcal{A}}(F_A, F_A)$ remains unchanged under $\alpha$ and it is possible, but not necessary, to apply $\alpha$, items (i) and (ii) above are satisfied and $\alpha$ is a symmetry of $\mathcal{A}$. The set of all such symmetries under composition of bijections forms the automorphism group $\mathrm{Aut}(\mathcal{A})$. This group is the CMW fundamental symmetry group.
Theorem 12. The CMW fundamental symmetry group is isomorphic to the group of permutations of the set $A$ of basis processes.
The following corollary is obvious and is stated without proof.

Corollary 1. |Aut( A )| = |A|!
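As a small numerical illustration of Theorem 12 and Corollary 1, the sketch below builds the map on systems induced by each permutation of a three-element basis set and checks that it preserves catenation and permutes the systems of length at most three. The basis names and the helper `induced_automorphism` are illustrative assumptions. The check confirms that every permutation yields a symmetry and counts them; it does not by itself prove that no other automorphisms exist.

```python
from itertools import permutations, product
from math import factorial


def induced_automorphism(perm):
    """Map on systems obtained by relabelling each basis process via perm."""
    return lambda word: tuple(perm[a] for a in word)


basis = ("a", "b", "c")  # hypothetical basis process set A
symmetries = [dict(zip(basis, p)) for p in permutations(basis)]

# All systems of length 0..3 over the basis (1 + 3 + 9 + 27 = 40 words).
words = [w for k in range(4) for w in product(basis, repeat=k)]
short = words[:13]  # the 13 systems of length <= 2

for perm in symmetries:
    alpha = induced_automorphism(perm)
    # alpha preserves catenation: alpha(uv) = alpha(u) alpha(v).
    assert all(alpha(u + v) == alpha(u) + alpha(v) for u in short for v in short)
    # alpha is a bijection on the systems of length <= 3.
    assert sorted(map(alpha, words)) == sorted(words)

print(len(symmetries), factorial(len(basis)))
```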
A CMW invariant is a CMW property that remains unchanged after every symmetry in the CMW fundamental symmetry group has been applied to Mor A (F A , F A ).

Theorem 13. The images of $\mathcal{A}$ in $\wp_q$ and $\wp^X_q$ under the Müller-Wichards performance functors $F_{\phi_0}$ and $F^X_{\phi_0}$, $X \in \{C, D, CD, V\}$, respectively, are CMW invariants.
Theorem 14. If $\Delta$ is a commutative triangle in $\mathcal{A}$, then $\Delta$ and its image under Müller-Wichards performance functors are CMW invariants.

System Factorization and Product Performance Models
Recall from Section 2 that a morphism $h$ is factorizable if $h = f \circ g$, where $f$ is a monic morphism and $g$ is an epic morphism. In the category of systems, this means that when a system $w$ is factorizable as $w = uv$, then $w$ can be divided into the two subsystems $u$ and $v$ without affecting the order and content of the basis processes in $w$. The integrity of a factorization is maintained if the performance of $w$ is identical to that of $uv$.
Theorem 15. Every commutative triangle in $\mathcal{A}$ corresponds to a system factorization which maintains its integrity.
A commutative square in $\mathcal{A}$ corresponds to an equation $uv = wz$, where $u, v, w, z \in \mathrm{Mor}_{\mathcal{A}}(F_A, F_A)$.
Theorem 16. For every commutative square in $\mathcal{A}$ there are two systems in the square which have factorizations that maintain their integrity.
Corollary 2. Whenever $uv = wz$ is a commutative square in $\mathcal{A}$, then either $F^X_{\phi_0} u$ and $F^X_{\phi_0} z$ or $F^X_{\phi_0} w$ and $F^X_{\phi_0} v$ are factorizable in $\wp^X_q$, $X \in \{C, D, V\}$.
Corollary 3. If $F_{\phi_0} u = F_{\phi_0} v \otimes_q F_{\phi_0} w$, then $u$ is factorizable in $\mathcal{A}$ and maintains its integrity when $F_{\phi_0}$ is faithful.
Results similar to Corollary 3 also apply for $F^X_{\phi_0}$, $X \in \{C, D, CD, V\}$.
Additional modelling flexibility can be obtained using products of categories when a system is comprised of subsystems that have different sequential and/or parallel characteristics. Here, two basis sets $A$ and $B$ of processes and two morphism compositions $\otimes_q$ and $\otimes_p$ form the product category of systems $\mathcal{AB} = \mathcal{A} \times \mathcal{B}$ and the associated Müller-Wichards product performance category $\wp_{qp} = \wp_q \times \wp_p$, respectively. For the product category $\mathcal{AB}$ the single object is the pair $F_{AB} \equiv (F_A, F_B)$ and the morphisms are ordered pairs of systems, with composition performed component-wise. Similarly, for the product category $\wp_{qp}$ the object is the pair $\mathrm{Obj}_{\wp_{qp}} = \mathrm{Obj}_{\wp_q} \times \mathrm{Obj}_{\wp_p} = (M_q, M_p) \equiv M_{qp}$, and the set of morphisms consists of the ordered pairs $\mathrm{Mor}_{\wp_{qp}}(M_{qp}, M_{qp}) = \mathrm{Mor}_{\wp_q}(M_q, M_q) \times \mathrm{Mor}_{\wp_p}(M_p, M_p)$ such that for $g_{qp} \equiv (g_q, g_p) \in \mathrm{Mor}_{\wp_{qp}}(M_{qp}, M_{qp})$, $g_q : M_q \to M_q$ and $g_p : M_p \to M_p$, with composition performed component-wise. It is easily verified that $\mathcal{AB}$ and $\wp_{qp}$ are categories where $1_{F_{AB}} = (\varepsilon, \varepsilon)$ and $1_{M_{qp}} = ((\infty, i, 0), (\infty, i, 0))$ are the object identities. Now use the gauge maps $\phi_0 : A \to (\wp, \otimes_q)$ and $\theta_0 : B \to (\wp, \otimes_p)$ to define $F_{\phi_0, \theta_0} : \mathcal{AB} \to \wp_{qp}$ component-wise, so that $F_{\phi_0, \theta_0}(u, v) = (F_{\phi_0} u, F_{\theta_0} v)$ for a system $u$ over $A$ and a system $v$ over $B$. Of course, a similar construction can be made using products of a finite number of categories of systems and Müller-Wichards performance categories.
Thus, generally distinct performance triples $g_q \equiv (r_q, w_q, u_q)$ and $g_p \equiv (r_p, w_p, u_p)$ are obtained, which yield separate performance estimates for systems comprised of $A$ basis processes and for systems comprised of $B$ basis processes, respectively. Unless $q = p$, $g_q$ and $g_p$ cannot be properly combined algebraically to obtain a single performance estimate for a process comprised of both $A$ and $B$ basis processes. However, an inequality can be established by letting $g_1 = g_q$, $g_2 = (\infty, i, 0) = g_3$, and $g_4 = g_p$ in Lemma 2.6 in [1]. This yields $g_q \otimes_p g_p \geq g_q \otimes_q g_p$ when $1 \leq q \leq p \leq \infty$. An alternative approach is to estimate the combined processing time for systems composed of $A$ processes and for systems composed of $B$ processes from $g_q$ and $g_p$ using $t_{AB} = t_A + t_B$, where $t_A = |w_q| / r_q$ and $t_B = |w_p| / r_p$.
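The additive time estimate $t_{AB} = t_A + t_B$ is simple enough to state as code. The sketch below assumes a performance triple is stored as a Python tuple `(r, w, u)`; the numerical values are made up for illustration. The weight is typed as `complex` only because the text takes an absolute value of the weight and writes the identity triple with weight $i$.

```python
from typing import Tuple

Triple = Tuple[float, complex, int]  # (performance r, weight w, type flag u)


def processing_time(g: Triple) -> float:
    """Processing time |w| / r for a single performance triple."""
    r, w, _ = g
    return abs(w) / r


def combined_time(g_q: Triple, g_p: Triple) -> float:
    """t_AB = t_A + t_B for the A-part (triple g_q) and the B-part (triple g_p)."""
    return processing_time(g_q) + processing_time(g_p)


# Hypothetical values: an arithmetic triple and a data movement triple.
g_q = (4.0, 20.0, 1)   # t_A = 20 / 4 = 5
g_p = (2.0, 6.0, 0)    # t_B = 6 / 2 = 3
print(combined_time(g_q, g_p))
```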

Closed Form Models
In this section, the above theory is used to develop general closed form system performance estimation models where a system's performance r and weight w are explicit functions of the performances and weights of the processes that comprise the system.The next theorem provides models for systems comprised entirely of a finite number of arithmetic processes or data movement processes.In what follows, let where x = 1 or 0 when X = C or D, respectively, w = ∑ n i=1 w i , and r = w Now consider systems comprised entirely of a finite number of delay processes.
where r = 1 The last two theorems can be used to construct closed form models for systems comprised of finite numbers of arithmetic, data movement, and delay processes.Theorem 19.Suppose the system a 1 a 2 • • • a n ∈ Mor A (F A , F A ) is comprised of n C arithmetic processes, n D data movement processes, and n V delay processes such that n C + n D + n V = n.Then with r C and w C given by Theorem 17 when X = C and n = n C ; r D and w D given by Theorem 17 when X = D and n = n D ; and r V given by Theorem 18 when n = n V .
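A closed form of the Theorem 17 type can be sketched as follows, under a loudly stated assumption: that individual process times $t_i = w_i / r_i$ combine through a $q$-norm, $t = (\sum_i t_i^q)^{1/q}$, so that $q = 1$ gives the fully sequential sum and $q = \infty$ gives the fully parallel maximum. This combination rule is consistent with the description of $q$ given earlier and with $w = \sum_i w_i$, but it is an assumption here rather than a formula quoted from the text. The function name `closed_form` is illustrative.

```python
import math
from typing import Sequence, Tuple


def closed_form(rates: Sequence[float], weights: Sequence[float], q: float) -> Tuple[float, float]:
    """Return (r, w) for a system of like processes.

    ASSUMPTION: process times t_i = w_i / r_i combine as a q-norm;
    the system rate is then total weight divided by total time.
    """
    times = [w / r for r, w in zip(rates, weights)]
    w_total = sum(weights)
    if math.isinf(q):
        t_total = max(times)                                # fully parallel limit
    else:
        t_total = sum(t ** q for t in times) ** (1.0 / q)   # q = 1 is sequential
    return w_total / t_total, w_total


# Four equal processes, each with rate 2 and weight 4 (time 2 each).
rates, weights = [2.0] * 4, [4.0] * 4
for q in (1.0, 2.0, math.inf):
    r, w = closed_form(rates, weights, q)
    print(q, r, w, w / r)  # rate, weight, and total time
```

Under this assumption, $n$ equal processes have total time $n^{1/q}$ times the single-process time, so the system rate grows from the single-process rate at $q = 1$ to $n$ times that rate at $q = \infty$.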

Statistical Properties of r for Closed Form Models
Since system performance studies are often stochastic in nature, it is instructive to examine the statistical characteristics of the system performance variable r for several cases using the closed form models of Theorem 17 for arithmetic and data movement elements.

Fixed Rates and Independent Random Weights
Assume that for both element types each rate $r_i = r_I$, $i \in N$, where $r_I$ is a fixed value, and each weight $w_i$, $i \in N$, is an independent random variable described by the same Poisson distribution with mean $\lambda$. In this case, for fixed $q$, $n$, and $\lambda$, Equation (15) reduces to Equation (20). Note that it necessarily follows that $\sum_{i=1}^{n} w_i$ is also Poisson distributed with mean $n\lambda$. Using these assumptions, $10^4$ trials were generated for each $\lambda$ and $q$ combination, where $\lambda \in \{10, 20, 40\}$ and $q \in \{1.5, 2, 2.5, 3, 4, \infty\}$, when $n = 10$ and $r_I = 1$. A kernel estimator of the probability density function (pdf) for the overall system rate $r$ associated with each combination is shown in Figure 3. Observe from Equation (20) that since this figure is generated using $r_I = 1$, the horizontal axes of this figure can be interpreted as the ratio $r / r_I$. Thus, multiplying each pdf by $0 < r_I \neq 1$ yields the pdf for the associated $r_I$.
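The experiment can be approximated with the standard library alone. The sketch below rests on the same unconfirmed assumption as any $q$-norm reading of the model: with a fixed rate $r_I$, the system rate is taken to be $r = r_I \sum_i w_i / (\sum_i w_i^q)^{1/q}$. It reports sample means rather than kernel density estimates, uses fewer trials than the $10^4$ quoted above, and the function names and seed are illustrative choices.

```python
import math
import random
from statistics import mean


def poisson(lam: float, rng: random.Random) -> int:
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def system_rate(weights, r_fixed: float, q: float) -> float:
    """ASSUMED closed form: r = r_I * sum(w) / (sum(w^q))^(1/q)."""
    total = sum(weights)
    if math.isinf(q):
        norm = max(weights)
    else:
        norm = sum(w ** q for w in weights) ** (1.0 / q)
    return r_fixed * total / norm


def simulate(n: int, lam: float, q: float, trials: int, r_fixed: float = 1.0, seed: int = 0):
    """Return a list of simulated system rates for n Poisson-weighted processes."""
    rng = random.Random(seed)
    rates = []
    while len(rates) < trials:
        weights = [poisson(lam, rng) for _ in range(n)]
        if sum(weights) == 0:
            continue  # skip the degenerate all-zero draw
        rates.append(system_rate(weights, r_fixed, q))
    return rates


for q in (1.0, 1.5, 2.0, 4.0, math.inf):
    sample = simulate(n=10, lam=10.0, q=q, trials=2000)
    print(q, round(mean(sample), 3))
```

Reusing the same seed for every $q$ means each $q$ sees identical weight draws, so differences between the printed means reflect $q$ alone rather than sampling noise.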
Inspection of Figure 3 quantifies, for a system comprised of a fixed number of basis processes and a fixed processing rate, the intuitively pleasing facts that: (i) for a fixed $q$ value, the peak value of the pdf increases, the $r$ value of the peak of the pdf effectively remains fixed, and the width of the pdf narrows as the mean process weight $\lambda$ increases (i.e., the probability that the value of the overall system rate $r$ will be within a small fixed interval about the peak $r$ value increases with increasing mean process weight); and (ii) for a fixed mean process weight, the $r$ value at the peak of the pdf increases and the width of the pdf broadens as $q$ increases in value (i.e., the overall peak system rate $r$ increases and the probability that $r$ will be within a small interval about the peak rate decreases with increasing system "speedup" or parallelism).
Kernel estimator pdfs were also obtained for $r$ using $10^4$ trials and Equation (20) with $r_I = 1$ for $n \in \{2, 4, 8, 10, 16\}$ and $q \in \{1.5, 2, 2.5, 3, 4, \infty\}$. The weights for each trial were determined from a Poisson distribution with $\lambda = 10$. These results are presented in Figure 4 where, as expected from the Figure 3 results, it is seen for each $n$ that the peak $r$ value increases, the distribution broadens, and the value of the pdf peak decreases with increasing $q$. Also expected is the fact that, although the pdf peak values and distribution width effectively remain the same, the distributions shift to larger $r$ values with increasing $q$ as $n$ increases.

Random Rates and Random Weights
Now consider trials where, in addition to having Poisson distributed random weights as just described, the rates $r_i$ in Equation (15) are also randomly assigned for each trial using an exponential distribution parameterized by $\gamma$. Using this methodology, kernel estimator pdfs for $r$ were obtained using $10^4$ trials per combination for a range of $\gamma$ values and each $q \in \{1.5, 2, 2.5, 3, 4, \infty\}$ when $\lambda = 10$ and $n = 10$. These results are presented in Figure 5 for $\gamma \in \{2, 10\}$ and in Figure 6, where $q = 1$ is also included and $\gamma \in \{1, 2, 3\}$. Each of the pdfs in the two lower panels of Figure 6 is scaled by its $\gamma$ value, i.e., all of the resulting $r$ values are divided by their respective $\gamma$ values, so that the pdfs for all $\gamma$ values fit on the same horizontal axis value range.

Figure 6. Probability density functions for $r$ when $n = 10$, $\lambda = 10$, $\gamma \in \{1, 2, 3\}$, and $q \in \{1, 1.5, 2, 2.5, 3, 4, \infty\}$. Each $r$ value for $\gamma \in \{2, 3\}$ has been divided by the associated value of $\gamma$.
Observe from these figures that, when compared with the previous results for fixed $r_I$, the inclusion of random rates causes the $q$ parameterized pdfs to be closer together and overlap. The presence of randomness increases the variance associated with each $r$ pdf and, interestingly, tends to significantly decrease the sensitivity of the pdfs to the value of $q$. This comparison is made clearer in Figure 7, where the graphs of the pdfs for $n = 10$, $\lambda = 10$, and $q \in \{1.5, 2, 2.5, 3, 4\}$ are placed one above the other for the case where both the rates and weights are randomly selected as above when $\gamma = 10$ (upper panel) and the case where the rates all have unit value and only the weights are Poisson distributed (lower panel).

Special Case Closed Form Models
Here, Theorems 17-19 are used to produce several closed form models when the performance values and the weight values are assumed to be equal for all processes in a system. Such processes are equal processes and the associated system models are special case closed form models (SCCFMs).

SCCFM 1. If $a_1 a_2 \cdots a_n \in \mathrm{Mor}_A(F_A, F_A)$ is a system such that $\phi_0(a_i) = (\rho, \lambda, 1)$, $i \in N$, then $F^C_{\phi_0}\, a_1 a_2 \cdots a_n = (r, w, 1)$, where $w = n\lambda$ and $r = n^{1 - 1/q}\rho$. The time required for the system to complete its arithmetic operations is $t^C_q = n^{1/q}\lambda/\rho$.

Since the next model is also a direct consequence of Theorem 17 and its proof follows that of the previous model, it is stated without proof.

SCCFM 2. If $a_1 a_2 \cdots a_n \in \mathrm{Mor}_A(F_A, F_A)$ is a system such that $\phi_0(a_i) = (\sigma, \omega, 0)$, $i \in N$, then $F^D_{\phi_0}\, a_1 a_2 \cdots a_n = (r, w, 0)$, where $w = n\omega$ and $r = n^{1 - 1/q}\sigma$. The time required for the system to complete moving all of its data is $t^D_q = w/r = n^{1/q}\omega/\sigma$.

SCCFM 3. If $a_1 a_2 \cdots a_n \in \mathrm{Mor}_A(F_A, F_A)$ is a system of $n$ equal delay processes, each with rate $\delta$, then the resulting system rate is $r = n^{-1/q}\delta$. The time delay for this system is $t^V_q = 1/r = n^{1/q}/\delta$.

Note that: (i) when $q = 1$, the processing is sequential and, as required, the time $t^X_1$, $X \in \{C, D, V\}$, for the systems to complete their processing is the sum of the $n$ individual times required to complete each process in the system; (ii) when $1 < q < \infty$, the processing time is "skewed" or "compressed" and $t^X_q < t^X_1$.

SCCFM 4. Suppose the system $a_1 a_2 \cdots a_n \in \mathrm{Mor}_A(F_A, F_A)$ is comprised of $n_C$ equal arithmetic processes, $n_D$ equal data movement processes, and $n_V$ equal delay processes such that $n_C + n_D + n_V = n$. Then the system's performance triple is determined by $r_C$ and $w_C$, given by SCCFM 1 when $n = n_C$; $r_D$ and $w_D$, given by SCCFM 2 when $n = n_D$; and $r_V$, given by SCCFM 3 when $n = n_V$. The time required for this system to complete its processing is:

$$t^{CDV}_q = \left[ \left(t^C_q\right)^q + \left(t^D_q\right)^q + \left(t^V_q\right)^q \right]^{1/q}.$$

Again, note that when $q = 1$, $t^{CDV}_1$ is the sum of the system delays, the time required for the system to complete processing its arithmetic operations, and the time required for the system to complete its data movement operations.
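The closed forms for equal processes can be checked numerically against a direct combination of the individual process times. The Python sketch below is illustrative only: it assumes that per-process times $w_i / r_i$ combine through a $q$-norm (a plain sum for $q = 1$, a maximum for $q = \infty$), and that the SCCFM 4 processing time combines the three subsystem times in the same way. All function names are hypothetical.

```python
import math

def q_norm(times, q):
    """Combine times: plain sum for q = 1, maximum for q = inf."""
    if math.isinf(q):
        return max(times)
    return sum(t ** q for t in times) ** (1.0 / q)

def sccfm_equal_processes(n, rate, weight, q):
    """Closed form for n equal arithmetic (or data movement) processes.

    Returns (r, w, t) with w = n * weight, r = n**(1 - 1/q) * rate and
    t = n**(1/q) * weight / rate.
    """
    inv_q = 0.0 if math.isinf(q) else 1.0 / q
    w = n * weight
    r = n ** (1.0 - inv_q) * rate
    t = n ** inv_q * weight / rate
    return r, w, t

def combined_time(t_c, t_d, t_v, q):
    """ASSUMED SCCFM 4 combination: q-norm of the three subsystem times."""
    return q_norm([t_c, t_d, t_v], q)

if __name__ == "__main__":
    n, rho, lam = 8, 2.0, 10.0
    for q in (1.0, 1.5, 2.0, 4.0, math.inf):
        r, w, t = sccfm_equal_processes(n, rho, lam, q)
        direct = q_norm([lam / rho] * n, q)
        print(f"q={q}: closed form t={t:.4f}, direct t={direct:.4f}, w/r={w / r:.4f}")
```

For each $q$ the closed form time, the directly combined time, and the ratio $w / r$ should coincide, which is the kind of "sanity check" a software implementation of the model could run.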

Applications of Special Case Closed Form Models
Special case closed form models are useful for understanding and describing fundamental properties and dynamics of CMW system performance models.The following subsections provide several examples of this.
10.1. The Effect of $q$ Upon the $t^X_q$-$n_X$ Dependence, $X \in \{C, D, V\}$

It is easily seen from SCCFM1-SCCFM3 that the differential of $t^X_q$ (with respect to $n_X$) can be written as:

$$dt^X_q = \frac{1}{q}\,\frac{t^X_q}{n_X}\,dn_X.$$

Dividing both sides of this equation by $t^X_q$ yields:

$$\frac{dt^X_q}{t^X_q} = \frac{1}{q}\,\frac{dn_X}{n_X},$$

which can be written as

$$d \ln t^X_q = \frac{1}{q}\, d \ln n_X$$

or as

$$\frac{d \ln t^X_q}{d \ln n_X} = \frac{1}{q}.$$

It follows from this that for these SCCFMs, $\ln t^X_q$ varies linearly with $\ln n_X$ with an associated slope of $q^{-1}$. This implies the intuitively pleasing result that for SCCFM1-SCCFM3, increasing the system's parallelization (i.e., increasing the $q$ value) decreases the processing time $t^X_q$, regardless of the number $n_X > 1$ of basis processes that must be processed.
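The log-log slope can be confirmed numerically with a short Python sketch. It assumes only the closed form $t^X_q = n_X^{1/q}\, t_0$, where $t_0$ is the time of a single process; the names `loglog_slope` and `unit_time` are hypothetical.

```python
import numpy as np

def loglog_slope(q, unit_time=5.0, n_values=(1, 2, 4, 8, 16, 32)):
    """Fit ln t against ln n for t = n**(1/q) * unit_time; return the slope."""
    n = np.asarray(n_values, dtype=float)
    t = n ** (1.0 / q) * unit_time
    slope, _intercept = np.polyfit(np.log(n), np.log(t), 1)
    return slope

if __name__ == "__main__":
    for q in (1.0, 1.5, 2.0, 4.0):
        print(f"q={q}: fitted slope={loglog_slope(q):.6f}, 1/q={1.0 / q:.6f}")
```

The fitted slope does not depend on `unit_time`, which only shifts the intercept of the line.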

A Stationary Action Principle for SCCFM4 System Performance
Consider SCCFM4, assume for fixed $q$ that $\left(t^C_q\right)^q + \left(t^D_q\right)^q + \left(t^V_q\right)^q = \chi(t) \equiv \chi$ is time dependent, and let the function $L_q$, where $\dot{\chi} \equiv d\chi/dt$, describe the system's performance with time. Using Equation (29), it is found that:

$$\frac{d}{dt}\frac{\partial L_q}{\partial \dot{\chi}} - \frac{\partial L_q}{\partial \chi} = 0,$$

i.e., $L_q$ satisfies the Euler-Lagrange equation. Consequently, $L_q$ is the performance Lagrangian for the system, Equation (30) is the associated equation of motion for the system's performance, and the integral $\int_0^{\tau} L_q\, dt$ is stationary so that its first variation $\delta$ vanishes (e.g., [27]). This has the following interpretation: If $\Sigma$ is the configuration space $\chi \times t \subset \mathbb{R}^+ \times \mathbb{R}^+$ and processing begins at $t = 0$ and is completed at $t = \tau$, then the actual processing path followed by $\left(t^C_q\right)^q + \left(t^D_q\right)^q + \left(t^V_q\right)^q$ in $\Sigma$ during the fixed processing interval $[0, \tau]$ is such that:

$$\delta \int_0^{\tau} L_q\, dt = 0$$

with respect to all path variations in $\Sigma$ that vanish at the end-points of the interval.

Invariance of the Equation of Motion
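Note that for any arbitrary (twice differentiable) function $\varphi(\chi)$, the transformed performance Lagrangian

$$L^{\#}_q \equiv L_q + \dot{\varphi} \qquad (33)$$

also describes the processing path followed by $\left(t^C_q\right)^q + \left(t^D_q\right)^q + \left(t^V_q\right)^q$ in $\Sigma$ during the fixed processing interval $[0, \tau]$, since $\dot{\varphi}$ is a total time derivative and therefore contributes nothing to the Euler-Lagrange equation:

$$\frac{d}{dt}\frac{\partial L^{\#}_q}{\partial \dot{\chi}} - \frac{\partial L^{\#}_q}{\partial \chi} = \frac{d}{dt}\frac{\partial L_q}{\partial \dot{\chi}} - \frac{\partial L_q}{\partial \chi} = 0.$$

Thus, although the performance Lagrangian is not unique, by extension, it can be concluded that the equation of motion for the system's performance is invariant under all transformations of the performance Lagrangian of the form given by Equation (33).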
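The invariance can be verified term by term. The following short derivation is a standard calculation, written here in the section's notation with $\varphi' \equiv d\varphi/d\chi$ (a symbol introduced only for this illustration); it shows that the added term $\dot{\varphi}$ satisfies the Euler-Lagrange equation identically.

```latex
\begin{align*}
\dot{\varphi} &= \frac{d\varphi}{dt} = \varphi'(\chi)\,\dot{\chi}, \\
\frac{\partial \dot{\varphi}}{\partial \dot{\chi}} &= \varphi'(\chi),
\qquad
\frac{d}{dt}\frac{\partial \dot{\varphi}}{\partial \dot{\chi}}
  = \varphi''(\chi)\,\dot{\chi}, \\
\frac{\partial \dot{\varphi}}{\partial \chi} &= \varphi''(\chi)\,\dot{\chi}, \\
\frac{d}{dt}\frac{\partial \dot{\varphi}}{\partial \dot{\chi}}
  - \frac{\partial \dot{\varphi}}{\partial \chi} &= 0 .
\end{align*}
```

Equivalently, $\int_0^{\tau} \dot{\varphi}\, dt = \varphi(\chi(\tau)) - \varphi(\chi(0))$ depends only on the end-points, so its variation vanishes for all path variations that vanish there. The requirement that $\varphi$ be twice differentiable is what makes the second line well defined.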

Concluding Remarks
This paper has presented a category theoretic reformulation of the Müller-Wichards system performance model.The use of category theoretic terminology provides a precise and efficient mathematical vocabulary for defining categories of systems, system performance, and system performance functors.These functors were shown to be useful for discussing factorizing systems without changing their performance and identifying model symmetries and invariants.Such formal mathematical properties of models tend to characterize aspects of the real systems that the models represent.
A practical feature of the model is that it can be implemented in a relatively straightforward manner as a software package. A user can assign to each basis process, via a gauge map, its numerical triple in the performance category. This defines for each basis process its type (arithmetic, data movement, or delay element), rate, and weight. The associated performance functor then uses these assignments to automatically generate a final performance triple for any system (string of basis processes) in the category of systems defined by the process basis set. Several other useful features of the model include using closed form performance models to provide "quick" performance estimates, as well as to provide "sanity checks" for results obtained from software implementations of the model; generating stochastic-based performance studies by treating rates and weights as random variables (as illustrated in Section 8); and applying closed form models to abstractly characterize aspects of system performance (as illustrated in Section 10).
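As a rough illustration of what such a software package might look like, the Python sketch below maps a string of basis processes to a performance triple. It is deliberately restricted to arithmetic processes and rests on assumptions not stated in this section: the process names and numbers in `GAUGE` are invented, and the combination rule (weights add, times $w_i / r_i$ combine through a $q$-norm) is an assumed form of the product in the performance category, not the authors' implementation.

```python
import math

# Hypothetical gauge map: basis process name -> (rate, weight, type flag).
# Names and numbers are illustrative only; flag 1 marks an arithmetic process.
GAUGE = {
    "add": (4.0, 8.0, 1),
    "mul": (2.0, 8.0, 1),
    "fma": (1.0, 4.0, 1),
}

def _q_norm(times, q):
    """Plain sum for q = 1, maximum for q = inf."""
    if math.isinf(q):
        return max(times)
    return sum(t ** q for t in times) ** (1.0 / q)

def performance_triple(system, gauge, q):
    """Map a non-empty sequence of arithmetic basis processes to one triple."""
    triples = [gauge[name] for name in system]
    if any(flag != 1 for _, _, flag in triples):
        raise ValueError("this sketch only handles arithmetic processes")
    w = sum(weight for _, weight, _ in triples)
    t = _q_norm([weight / rate for rate, weight, _ in triples], q)
    return (w / t, w, 1)

def combine(left, right, q):
    """Combine two arithmetic triples (assumed form of the product)."""
    (r1, w1, _), (r2, w2, _) = left, right
    w = w1 + w2
    t = _q_norm([w1 / r1, w2 / r2], q)
    return (w / t, w, 1)

if __name__ == "__main__":
    system = ["add", "mul", "fma", "add"]
    whole = performance_triple(system, GAUGE, 2.0)
    parts = combine(performance_triple(system[:2], GAUGE, 2.0),
                    performance_triple(system[2:], GAUGE, 2.0), 2.0)
    print("whole system:", whole)
    print("combined subsystems:", parts)
```

Evaluating the whole string and combining the triples of its two substrings give the same result, which mirrors the paper's point that a system can be factored into subsystems without changing its performance.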
Future research involves defining an appropriate metric $d$ on $\mathrm{Mor}_A(F_A, F_A)$ that measures the similarity between systems, and finding an approach for classifying subsets of interest in the associated metric space $(\mathrm{Mor}_A(F_A, F_A), d)$ such that the systems within such a class are very similar [28]. An implementation of this classification scheme in a software package, which also …

Proof of Theorem 15. Let $w = uv$ be a commutative triangle in $A$. It follows from Theorem 10 that $u$ and $v$ are bimorphisms, in which case $u$ is monic and $v$ is epic and $w = uv$ is a factorization of $w$. Since functors preserve commutative triangles, $F_{\phi_0} w = F_{\phi_0} u \otimes_q F_{\phi_0} v$ is the associated commutative triangle in $\wp_q$. It is clear that the integrity of the factorization of $w$ is maintained because $F_{\phi_0} w = F_{\phi_0} uv = F_{\phi_0} u \otimes_q F_{\phi_0} v$. Similarly for the functors $F^X_{\phi_0}$, $X \in \{C, D, CD, V\}$.

Proof of Theorem 16. If $uv = wz$ is a commutative square in $A$, then, because $I_A$ is equidivisible (Lemma 5), there exists an $x \in \mathrm{Mor}_A(F_A, F_A) = I_A$ such that (i) $u = wx$ and $z = xv$, or (ii) $w = ux$ and $v = xz$. These are factorizations because every morphism in $A$ is a bimorphism (Theorem 10) so that in (i) $w$ and $x$ are monic and $x$ and $v$ are epic, or in (ii) $u$ and $x$ are monic and $x$ and $z$ are epic. Because each of these factorizations corresponds to a commutative triangle in $A$, it follows from Theorem 15 that they maintain their integrity.

Proof of Corollary 2. If $uv = wz$ in $\mathrm{Mor}_A(F_A, F_A)$, then it must be the case that there is an $x \in \mathrm{Mor}_A(F_A, F_A) = I_A$ such that $u = wx$ and $z = xv$, or $w = ux$ and $v = xz$ (Lemma 5). Since $u = wx$, $z = xv$, $w = ux$, and $v = xz$ are commutative triangles in $A$, they are preserved by $F^X_{\phi_0}$ in $\wp^X_q$, $X \in \{C, D, V\}$, so that $F^X_{\phi_0} u = F^X_{\phi_0} w \otimes_q F^X_{\phi_0} x$ and $F^X_{\phi_0} z = F^X_{\phi_0} x \otimes_q F^X_{\phi_0} v$, or $F^X_{\phi_0} w = F^X_{\phi_0} u \otimes_q F^X_{\phi_0} x$ and $F^X_{\phi_0} v = F^X_{\phi_0} x \otimes_q F^X_{\phi_0} z$. The result follows from the fact that every morphism in the image of $F^X_{\phi_0}$ in $\wp^X_q$, $X \in \{C, D, V\}$, is a bimorphism so that $F^X_{\phi_0} w$ and $F^X_{\phi_0} x$ are monic and $F^X_{\phi_0} x$ and $F^X_{\phi_0} v$ are epic, or $F^X_{\phi_0} u$ and $F^X_{\phi_0} x$ are monic and $F^X_{\phi_0} x$ and $F^X_{\phi_0} z$ are epic.

Proof of Corollary 3. If $F_{\phi_0}$ is faithful, then the commutative triangle $F_{\phi_0} u = F_{\phi_0} v \otimes_q F_{\phi_0} w$ in $\wp_q$ reflects to the commutative triangle $u = vw$ in $A$ (Lemma 4). Since every morphism in $A$ is a bimorphism (Theorem 10), $u$ is factorizable because $v$ is monic and $w$ is epic. Since $F_{\phi_0} u = F_{\phi_0} v \otimes_q F_{\phi_0} w$, the factorization of $u$ maintains its integrity.
Proof of Theorem 17. That the action of the functor on $a_1 a_2 \cdots a_n$ is as stated in the consequence of the theorem follows from Theorem 11 and the fact that $\phi_0(a_i) = (r_i, w_i, x)$, $i \in N$, where $x = 1$ or $0$ when $X = C$ or $D$, respectively. The remainder of the proof is by induction.

For the product $(r_C, w_C, 1) \otimes_q (r_D, w_D, 0) \otimes_q (r_V, i, 0)$, combining $w_{CD} = w_C$ with the weight of the third triple gives a final combined weight of $w_{CDV} = \mathrm{Re}[1 \cdot w_{CD} + 0 \cdot i + \neg(1 \vee 0)(w_{CD} + i)] = w_{CD} = w_C$.

Proof of SCCFM1. The results for $w$ and $r$ follow from Theorem 17 since $w = \sum_{i=1}^{n} \lambda = n\lambda$ and $r = n\lambda \big/ \left[\sum_{i=1}^{n} (\lambda/\rho)^q\right]^{1/q} = n^{1 - 1/q}\rho$.

Preservation and reflection are two important features of functors. A functor $F : C \to D$ preserves a categorical property ([22], p. 97) $\pi$ if, whenever an object, morphism, or diagram has property $\pi$ in $C$, the image under $F$ of that object, morphism, or diagram has property $\pi$ in $D$. Similarly, $F$ reflects property ([22], p. 97) $\pi$ if, whenever the image under $F$ of an object, morphism, or diagram has property $\pi$ in $D$, that object, morphism, or diagram has property $\pi$ in $C$.

Figure 1. The commutative triangle for f • g = h in C.

Figure 2. The preserved image of Figure 1 under F in D.
Figure 7. Probability density functions for $r$ when $n = 10$, $\lambda = 10$, and $q \in \{1.5, 2, 2.5, 3, 4\}$ for the case where both the rates and weights are randomly selected with $\gamma = 10$ (upper panel) and the case where the rates all have unit value and only the weights are Poisson distributed (lower panel). Each of the pdfs in the upper panel is scaled by the associated $\gamma = 10$ value.