Design and Verification of Multi-Agent Systems with the Use of Bigraphs

Widespread access to low-cost, high computing power allows for increased computerization of everyday life. However, high-performance computers alone cannot meet the demands of systems such as the Internet of Things or multi-agent robotic systems. For this reason, modern design methods are needed to develop new projects and extend existing ones. Because of the high interest in this subject, many methodologies for designing such systems have been developed. None of them, however, can be considered the default one to which others are compared. Any useful methodology must provide tools, versatility, and the capability to verify its results. This paper presents an algorithm for verifying the correctness of multi-agent systems modeled as tracking bigraphical reactive systems and for checking whether a behavior policy for the agents meets non-functional requirements. The memory complexity of the methods used to construct behavior policies is also discussed, and a few ways to reduce it are proposed. Detailed examples of the algorithm's usage are presented, involving non-functional requirements regarding the time and safety of behavior policy execution.


Introduction
With the increase of computational power and its availability comes the desire to incorporate it more into our daily life. Current ideas on how to do this include the Internet of Things, multi-agent systems (of which swarms of robots are a particular case), and smart objects and places (e.g., cities, homes, cars). All of them require new ways to design large-scale (i.e., consisting of a significant number of elements) software and physical systems that consider both how individual components interact and how a system as a whole works. There are various unresolved problems related to this. There is no consensus on which elements of the real world should be modeled and which of their capabilities should be taken into account in general. What is worse, different design methods use elements of the real world differently. Finally, the results of these methods are often incomparable, or at least there is no common way to evaluate multi-agent system design methods. Regardless, any method for designing complex systems must offer a specific range of capabilities to be considered useful.
The concept of agent is applied to entities that have autonomy and are placed in a changing environment. Multi-agent systems [1,2] are structures within which agents can be identified. One of the advantages of designs using agents is that they can be represented at different levels of detail, from abstract entities (like mathematical structures) to actual robots. For this reason, among others, the concept of multi-agent system is used in various contexts. This term may be used to characterize a group of machine learning methods [3,4].
It can also be used to highlight attributes of certain models and simulation approaches [5][6][7]. The term also refers to a subgroup of robotics solutions [8][9][10][11] that make use of broadly understood autonomous robots to perform assigned tasks. In this work, we will focus on multi-agent robotic systems (MARS). The literature [12][13][14][15][16] is replete with examples of various applications of multi-agent robotic systems. There are also methodologies and tools [10,17] to design such systems. However, there is no consensus on how to design such systems in general, and current solutions come from different areas of science. The most common paradigms used to design MARS include software design patterns [16], control theory [12,13], optimization theory, or combinations of the above [15]. Some examples utilize mathematical logic in MARS design [18], but they are much less common. Due to the lack of agreement on how to design MARS and the fact that results produced by different methodologies are difficult to compare, we will try to evaluate them based on their capabilities. In this paper, we will be interested not so much in how to design MARS but rather in how the following questions can be answered about an existing project:
• Is the project correctly designed? We want to ensure syntactic correctness, i.e., the correct use of formal tools such as mathematical logic, differential equations, or pi-calculus. We also care about semantic correctness, i.e., the ability to transform a formal model into a real solution (implementable on robots).
• How does one perform a simulation illustrating MAS operation?
• Have non-functional requirements been met, in particular those regarding safety and speed of task execution?
Verifying the correctness of a model is the simplest of these tasks, and most solutions can be verified using the tools they were made with. Verifying whether a designed system accomplishes a given task is much more difficult. The vast majority of methodologies in the literature use simulation for this purpose. Exceptions can be found among models that highly formalize the internals of agents, how they operate, and the course of a task itself. Verification by simulation also gets complicated as the model becomes more abstract. The simplest designs in this regard are those based on methods commonly used in other areas of science (such as differential equations or graph theory) or made using tools integrated with a simulator. Verification of non-functional requirements is a difficult part of the design. Methodologies commonly found in the literature, such as RE4Gaia [19], TROPOS [20], DIAMOND [21], or Adelfe [22], take non-functional requirements into account during the design process. They usually aim to enable the design of multi-agent systems in general (not just multi-agent robotic systems). Successive stages in most of these methodologies are not closely coupled together. By a loosely coupled process, we understand a design process where a designer's interpretation of how the system works plays a significant role the whole time. In other words, one cannot treat the results of one stage as an input that the next stage will automatically transform into a form acceptable by yet another stage. When it comes to verification of system requirements, it should be noted that none of the above methodologies offers formal guarantees regarding the system's functionality, unlike methods dedicated to specific tasks. An example of such a method can be found in [13], where a formal guarantee is given for robots to move while keeping at least a specified distance from each other (an example of a non-functional requirement).
In [12] a guarantee of fulfillment of functional requirements is presented where a task is guaranteed to be carried out if certain conditions are satisfied.
Using bigraphs [23] to design multi-agent systems is a relatively new approach to modeling this kind of system. The bigraph theory was published by Robin Milner in 2008 and has already been extended with a notion of overlapping locations [24] and probability [25]. Bigraphs are currently found useful in areas such as system of systems design [26], IoT [27], and wireless network modeling [28]. Currently, there are a few tools that support modeling systems with bigraphs; the most notable of them are Bigraphical Model Checker [29] (discontinued), Bigraph Framework for Java [30], and BigraphER [31]. The first two of them focus on checking the reachability of certain states of a system [29,31], whereas the last one provides means to analyze various aspects of a modeled system (especially useful in this regard is the underlying OCaml library bigraph). We believe that BigraphER [31] provides the most advanced set of utilities to model systems with bigraphs available at the moment. Multi-agent system design methodologies [32,33] involving bigraphs are scarce, and most of them do not consider generating behavior policies based on a constructed model. As exceptions to this, one may point out the BigActor methodology described in [34], which uses bigraphs mixed with the notion of actors [35], or our methodology [36] based on bigraphs with tracking.
In [36] we have proposed a methodology based on bigraphs with tracking [23] that enables the design of multi-agent systems. We have chosen tracking bigraphs primarily because they allow for analysis of objects' activities over time without introducing another layer of abstraction (as was done, for example, in [34]). Our methodology is devoid of some of the drawbacks we mentioned earlier, such as loose coupling between design stages or reliance on the designer's interpretation of the system's internals at all stages of the design process. Moreover, successive stages of the methodology are module-like, which means their implementations can be adjusted to project needs. The methodology's main disadvantages are high computational complexity, the limitation of the system's agents to entities that can be fully controlled, and the fact that the operation of a designed system is determined before it is started. It also does not offer universal guarantees of successful task completion such as those presented in [12,13,18]. Putting our work in a broader context, we can place our methodology in a group of bottom-up [37] methods of MAS design, with a note that it focuses on global goals rather than individual ones. In fact, agents in our approach do not have preferences that can affect their actions. A distinguishing feature of our proposition is the lack of abstractions outside the bigraphs framework; typically, agents' internal mechanics are modeled with BDI (Belief, Desire, and Intention) [32,38] or actors [34].
This work is an extension of the methodology proposed in [36]. This paper aims to demonstrate how to verify the correctness of a design, check the fulfillment of non-functional requirements, and visualize behavior policies. We have developed an algorithm to automatically verify the correctness of a model and construct successive simulation states. We also describe how to verify whether non-functional requirements are satisfied by a behavior policy for agents in the system. An example implementation [39] of the algorithm has been prepared. We also address the memory complexity of operations performed during behavior policy generation. We discuss how it influences the feasibility of projects and suggest a few ways to reduce it. Finally, two tools have been implemented: one [40] that incorporates all of the mentioned memory complexity reduction strategies and one [41] that illustrates constructed behavior policies.

Methods and Materials
In this section, we will introduce all terms and definitions that are necessary to understand the examples presented in Section 3. Section 2.1 is devoted to basic informal definitions that will be used throughout the rest of this article. Sections 2.2-2.4 aim to quickly acquaint the reader with the methodology described in detail in [36], and for that reason micro-examples are included at the end of each of these subsections. Section 2.5 is dedicated to an algorithm for verification and visualization of behavior policies. Since the algorithm is the central contribution of this article, examples of its usage are presented in Section 3.

Basic Concepts
Before formal definitions, we will introduce the following concepts:
• Task-a collection of objects from the real world along with the actions they can perform, the initial state, and the target-desired (final) state(s). An example of a task might be: "In an area that is a 3 × 3 grid, there are two robots in opposite (diagonally) cells. Each robot can move to vertically and horizontally adjacent cells and connect to a second robot if both are in the same cell. The goal of the task is for both robots to connect with each other."
• Mission-a realization of a task.
• Task element-a real-world entity that is relevant to the subject matter being modeled. Elements can be people, robots, areas, data sources, and receivers, etc.
• Passive object-a task element that can participate in activities without initializing them. It may contain other passive objects. We are not interested in their behavior, but we take into account the passage of time for them. The number of passive objects is constant during a mission.
• Active object (agent)-a task element that can participate in activities by initializing them. It can contain other active and passive objects. We are interested in their behavior, and we take into account the passage of time for them. We can control them. It is assumed that the number of agents during a mission is constant.
• Environment-a task element that can participate in activities without initializing them. It can contain passive and active objects and be owned by at most one other object. We are not interested in its behavior, and do not consider the passage of time for it.
• Behavior policy-a set of planned actions for all agents that meets the following requirements:
  - Implementing a behavioral policy solves a given task;
  - All agents start the mission at the same time;
  - Agents can complete a mission at different points in time;
  - All agent activities must be performed continuously (without time gaps);
  - All agents that participate in a cooperative activity must start performing it at the same moment.
• Scenario-a mission using a specific behavioral policy.

Bigraphs
Throughout this article we will extensively use bigraphs, concrete bigraphs to be precise. Concrete bigraphs allow their nodes and edges to be identified by means of a support (more about that later). In contrast, abstract bigraphs lack such identifiers. In the rest of this article, whenever we refer to a bigraph, we will have a concrete bigraph in mind. A bigraph consists of two graphs: a place graph and a link graph. A place graph is intended to model spatial relations between system elements. A link graph is a hypergraph that can be used to model interlinking between the elements.
Formally, a bigraph is a tuple $B = (V_B, E_B, ctrl_B, prnt_B, link_B) : I \to O$, where:
• $V_B$-a set of vertex identifiers;
• $E_B$-a set of hyperedge identifiers. The union of both of these sets makes up the support of the bigraph;
• $ctrl_B : V_B \to K$-a function assigning a control type to vertices. $K$ denotes a set of control types and is called the signature of the bigraph;
• $(V_B, ctrl_B, prnt_B) : m \to n$ and $(V_B, E_B, ctrl_B, link_B) : X \to Y$ denote a place graph and a link graph respectively. The $prnt_B$ function defines hierarchical relations between vertices, roots, and sites. The $link_B$ function defines linking between vertices and hyperedges in the link graph;
• $I = \langle m, X \rangle$ and $O = \langle n, Y \rangle$ denote the inner face and the outer face of the bigraph $B$. By $m, n$ we will denote sets of preceding ordinals of the form $m = \{0, \ldots, m - 1\}$. Sets $X$ and $Y$ represent inner and outer names respectively. When any of the elements of an interface is omitted, it means it is either equal to 0 (when the interface lacks an ordinal) or empty (when there is no set of names). For example, the interface $I = \langle m \rangle$ has no inner names.
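To make the definition more tangible, the sketch below stores a concrete bigraph with an empty inner face as plain Python dictionaries. It is only an illustration under simplifying assumptions: the class name `ConcreteBigraph`, the vertex names, and the representation of the link graph as a map from hyperedges to sets of vertices (instead of ports) are hypothetical and do not come from the methodology itself. The example instance is one plausible encoding of an initial state with two areas and two agents.

```python
from dataclasses import dataclass, field


@dataclass
class ConcreteBigraph:
    """Simplified concrete bigraph with an empty inner face (no sites, no inner names)."""
    ctrl: dict                                  # vertex id -> control type
    prnt: dict                                  # vertex id -> parent vertex id (str) or root index (int)
    link: dict = field(default_factory=dict)    # hyperedge id -> set of linked vertex ids
    roots: int = 1                              # the ordinal n of the outer face

    @property
    def vertices(self):
        return frozenset(self.ctrl)

    def support(self):
        """The support is the union of vertex and hyperedge identifiers."""
        return self.vertices | frozenset(self.link)

    def is_well_formed(self):
        """Every vertex must reach a valid root through prnt without cycles."""
        if set(self.prnt) != set(self.ctrl):
            return False
        for vertex in self.ctrl:
            seen, node = set(), vertex
            while not isinstance(node, int):
                if node in seen or node not in self.prnt:
                    return False
                seen.add(node)
                node = self.prnt[node]
            if not 0 <= node < self.roots:
                return False
        return all(set(ends) <= self.vertices for ends in self.link.values())


# Hypothetical encoding: areas "a" (control A) and "b" (control B), both agents in "a".
s0 = ConcreteBigraph(
    ctrl={"a": "A", "b": "B", "u1": "U", "u2": "U"},
    prnt={"a": 0, "b": 0, "u1": "a", "u2": "a"},
)
```

Because the identifiers are explicit, two agents with the same control `U` remain distinguishable, which is the property of concrete bigraphs that tracking relies on.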
An example of a graphical representation of a bigraph is presented in Figure 1. Reaction rules are used to model dynamics in bigraphical systems. In this paper, we will use (simplified) tracking reaction rules. A reaction rule consists of a pattern (redex) to be found in an input bigraph that shall be replaced with another bigraph (reactum).
Formally, a tracking reaction rule is a quadruple $(B_{redex}, B_{reactum}, \eta, \tau)$, where:
• $B_{redex}$-a bigraph called the redex;
• $B_{reactum}$-a bigraph called the reactum;
• $\eta : m \to m$-a map between sites from the reactum to sites in the redex;
• $\tau : V_{reactum} \to V_{redex}$-a map of the reactum's node identifiers onto the redex's node identifiers. It allows one to indicate which elements of an input bigraph are "residues" in an output bigraph.
Bigraphical Reactive System (BRS) is a tuple (B, R) where B denotes a set of bigraphs with empty inner face and R is a set of reaction rules defined over B. If R consists of rules with tracking then a pair (B, R) makes a Tracking Bigraphical Reactive System (TBRS).
Having a TBRS we can generate a Tracking Transition System (TTS). A Tracking Transition System is a 7-tuple $L_T = (Agt, Red, Lab, Apl, Par, Res, Tra)$, where:
• Agt-a set of bigraphs;
• Red-a set of redexes;
• Lab-a set of labels;
• Apl ⊆ Agt × Lab-an applicability relation;
• Par-a participation function, defined for redexes r ∈ Red and bigraphs b ∈ Agt. It indicates which vertices in an input bigraph correspond to elements in the redex of a transition;
• Res-a residue function. It maps vertices in an output bigraph that are residue of an input bigraph to the vertices in the input bigraph;
• Tra ⊆ Apl × Agt × Par × Res-a transition relation.
As we said at the beginning of this section, we will use a simple example to illustrate how the formal definitions can be used in practice. The system for our example consists of two areas and two agents (we do not care whether they are humans, robots, or other autonomous entities). Areas will be denoted by controls A and B, while agents will be represented with control U. We assume that agents can move from an area of type A to an area of type B in two ways, which differ in execution speed. Thus, the Tracking Bigraphical Reactive System of this system consists of three bigraphs and two reaction rules. The elements of the B set are described in Table 1 and the reaction rules are defined in Table 2. The Tracking Transition System of this TBRS is defined in Table 3.

Table 1. Elements of the B set for the introductory example.

| Name | Description |
|---|---|
| s0 | The initial state of the system. |
| s1 | The state where only one of the agents has moved to the B area. |
| s2 | The state where both agents have moved to the B area. |

Table 2. Elements of the R set for the introductory example. The η function for the first rule and both τ functions are identities. The first rule represents an action that allows a single agent to move between areas. The second rule is for an action where two agents move at once. The second rule is only reasonable if its underlying mechanism differs from that of the first rule.

State Space
Having a Tracking Transition System we can transform it into a state space of the modeled system. A state space can be later used to generate a behavior policy for agents (as defined in Section 2.1) in the system.
We assume the following about modeled systems:

1. The number of passive and active objects is constant during the whole mission;
2. A system cannot change its state without an explicit action of an agent (alone or in cooperation with other agents);
3. No actions performed by agents are subject to uncertainty;
4. A mission can end for each agent separately at different moments. In other words, agents do not have to finish their part of the mission all at the same time;
5. In the case of actions involving multiple objects (whether these are active or passive), all participants are required to start cooperation at the same moment.
For the rest of the elements of the C set, the + symbol serves only as an associative conjunction operator and does not denote any meaningful operation. In other words, for the rest of the elements the following formula is true:

Going back to our introductory example, we will now convert the Tracking Transition System from Table 3 into a state space of the system. We will not define all of the formal elements and will rather focus on the key ones. The set S consists of three elements, S = {0, 1, 2}, that correspond to bigraphs s0, s1, and s2 respectively. The set L consists of two elements that correspond to reaction rules in the TBRS, i.e., L = {r1, r2}. Knowing that there are only two agents in the system (so there are two objects in total), elements of the set I will be of the form $((i_1, x), (i_2, y))$. The elements $i_1, i_2$ correspond to identifiers of objects (in this case $i_1, i_2 \in \{1, 2\}$), while $x$ and $y$ indicate the moment of time at which each object currently is. We will clarify how to utilize the C set in the next subsection.

As mentioned earlier, the action represented by the r1 reaction rule takes 2 units of time, while the r2 reaction takes only 1 unit of time. How these values are obtained depends on the project and may be subject to many factors, such as the resolution of time that needs to be considered (whether these are minutes, seconds, or hours) or the variability (or lack thereof) of the time needed to execute actions represented by reaction rules. Knowing this, the elements of the T set are listed in Table 4. Subsequent elements of this set correspond to transitions in the TTS. The permutation resulting from the application of a transition function corresponds to the permutation of vertices corresponding to objects in the res function. It is also worth noting that the $f_3$ function requires both agents to be at the same moment of time (variable z) in order to return something other than 0.

Table 4. Mission progress function definitions for the state space presented in Figure 2. The action represented by the r1 reaction rule is assumed to take 2 units of time, while the action represented by r2 takes only 1 unit of time. Columns: Function; Function Definition.

Figure 2. State space for the Tracking Transition System from Table 3. Mission progress function definitions are given in Table 4.
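The timing part of a mission progress function can be sketched as follows. This is a minimal illustration, not the definitions from Table 4: the helper name `advance` and the use of `None` for the 0 element are assumptions, and the permutation of tuple elements performed by the actual transition functions is deliberately left out. A SAT configuration is written, as above, as a tuple of (object identifier, moment of time) pairs.

```python
ZERO = None  # stands for the 0 element, i.e., "no valid mission course"


def advance(config, participants, duration):
    """Move the clocks of `participants` forward by `duration`.

    All participants must be at the same moment of time, because objects
    taking part in a cooperative action have to start it simultaneously.
    Returns ZERO when this requirement is violated.
    """
    clocks = {time for obj, time in config if obj in participants}
    if len(clocks) != 1:
        return ZERO
    return tuple(
        (obj, time + duration) if obj in participants else (obj, time)
        for obj, time in config
    )


start = ((1, 0), (2, 0))
after_r1 = advance(start, {1}, 2)          # agent 1 alone performs r1 (2 time units)
after_r2 = advance(start, {1, 2}, 1)       # both agents perform r2 together (1 time unit)
out_of_sync = advance(after_r1, {1, 2}, 1) # clocks differ, so cooperation is impossible
```

The last call mirrors the remark about the cooperative function: when the two agents are at different moments of time, the result is the 0 element.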

Behavior Policy
We define a behavior policy as a schedule of actions for each object from the beginning of a mission to its end that meets all the requirements listed in Section 2.1.
Having a state space, we can view a behavior policy as a walk (in the graph-theoretic sense) indicating what changes are required (and who makes them) in order to reach a desired state.
To construct a proper behavior policy based on a state space, we need to define the following elements. Please note that by a series we will understand a finite sum of elements.
• $K^t_s = c_1 + \cdots + c_m$-a series, where summands are mission courses leading to the state $s$;
• $N_K(K^t_s) \in \mathbb{N}$-a function returning the number of elements in a given series. According to the earlier definition, for any series $K^t_s$ this function returns the value of $m$ (the greatest index of $c_i$);
• a series, indexed by $i, j \in \{0, \cdots, n_s - 1\}$ and $t \in \mathbb{N}$, whose summands are mission progress functions from the $i$-th to the $j$-th state;
• $M^t_K$-a matrix whose elements are series indicating possible walks leading to each state. Index $t$ denotes the number of steps made in a state space. By a step we understand a transition between vertices (including the situation where traversal does not change the vertex);
• $M^t_F$-a matrix of transitions between states.
Furthermore, we define two operations, among them a multiplication of the matrices defined above. Elements of the new matrix are defined by the formula:

In order to generate all walks consisting of a specified number of steps from an initial state to a final state, one must define the initial state as an $M^0_K$ matrix and multiply subsequent results by $M^i_F$ the specified number of times. The result will be an $M^x_K$ matrix, whose summands in the $i$-th column indicate all possible walks with $x$ steps that end in the $i$-th state of the state space. If the element in the specified column is equal to 0, it means there is no such walk.
Summarizing our introductory example, we will demonstrate how to use the state space from Figure 2 with transition functions definitions listed in Table 4 to determine all sequences of actions that lead to the state denoted as s2. Each sequence is equivalent to behavior policy that, when applied, results in moving both agents to the area of type B.
To determine such sequences, we create two matrices: a matrix of transitions $M^t_F$ and a matrix of the initial state $M^0_K$. Having both of them, we can multiply subsequent $M^t_K$ matrices by the corresponding $M^t_F$ matrices and check whether the third state (recall that numbering starts from 0) is reachable. By reachable, we understand having a value other than 0 in the specified column of the $M^t_K$ matrix. Definitions of both matrices are listed below:

The (2,0) tuple in the first column of the $M^0_K$ matrix denotes that we have two objects. The zeros in both tuples indicate that each object starts the mission at the same moment. Subsequent $M^t_K$ matrices allow us to determine how a system changes when a specified number of actions occur. For example, $M^1_K$ gives us information about how the system evolves when one action occurs (analogously $M^2_K$ for two actions, etc.).
In our example $M^1_K$ and $M^2_K$ are of the form:

The interpretation of each of the above $M^t_K$ matrices is as follows. The $M^1_K$ matrix indicates that with just one action there are two ways for the system to reach the state where one of the agents has moved to the area of type B and the other one has not taken any action (as indicated by the fact that its time is equal to 0). Both ways require the specified agent to carry out the action represented by the r1 rule. The same matrix also tells us that with one action it is possible to reach the s2 state if both agents engage in cooperative execution of the r2 rule. Finally, the $M^2_K$ matrix points out two walks in the state space that lead to the s2 state. Both involve performing the action associated with the r1 rule two times (each time by a different agent).
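The walk generation described above can be mimicked with a short sketch. It is an illustration under stated assumptions rather than the implementation from [36]: the four functions `f1` to `f4` are an assumed reconstruction of the mission progress functions (Table 4 is not reproduced here), `None` stands for the 0 element, matrices are stored as dictionaries, and the convention that the first tuple of a configuration in state 1 describes the agent still in area A is chosen only for this example.

```python
from collections import defaultdict


def f1(cfg):  # state 0 -> 1: the first agent performs r1; tuples are permuted
    (i1, x), (i2, y) = cfg
    return ((i2, y), (i1, x + 2))


def f2(cfg):  # state 0 -> 1: the second agent performs r1
    (i1, x), (i2, y) = cfg
    return ((i1, x), (i2, y + 2))


def f3(cfg):  # state 0 -> 2: both agents perform r2; they must be synchronized
    (i1, x), (i2, y) = cfg
    return ((i1, x + 1), (i2, y + 1)) if x == y else None


def f4(cfg):  # state 1 -> 2: the agent still in area A performs r1
    (i1, x), (i2, y) = cfg
    return ((i1, x + 2), (i2, y))


# Transition matrix: M_F[i][j] lists the functions leading from state i to state j.
M_F = {0: {1: [("f1", f1), ("f2", f2)], 2: [("f3", f3)]}, 1: {2: [("f4", f4)]}}


def step(M_K):
    """One multiplication of M_K by M_F; M_K maps a state to a list of (configuration, walk)."""
    result = defaultdict(list)
    for i, courses in M_K.items():
        for j, functions in M_F.get(i, {}).items():
            for cfg, walk in courses:
                for label, func in functions:
                    new_cfg = func(cfg)
                    if new_cfg is not None:  # drop the 0 element
                        result[j].append((new_cfg, walk + (label,)))
    return dict(result)


M0 = {0: [(((1, 0), (2, 0)), ())]}  # both objects start at time 0 in state 0
M1 = step(M0)
M2 = step(M1)
```

Under these assumptions `M1` contains two courses ending in state 1 and one course ending in state 2, while `M2` contains exactly two courses ending in state 2, which agrees with the interpretation given above. A state missing from a dictionary plays the role of a 0 entry in the corresponding column.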
It is worth pointing out that in a software implementation of the above algorithm, labels should denote specific transition functions rather than reaction rules. While for this particular example it was sufficient to indicate what "kind" of changes (i.e., reaction rules) need to occur in the system, for automated generation of behavior policies it is necessary to distinguish exactly what transformation (including who participated in it) is required.
For more detailed examples we refer to [36].

Verification and Visualization of Behavior Policies
Below we will describe the algorithm to verify and illustrate a behavior policy. It consists of four phases. At the beginning of the discussion of each phase, formal elements not introduced so far will be defined. The phases will be discussed in such an order that newly introduced definitions are used directly in the phase being discussed. A diagram of relationships between phases is presented in Figure 3, from which it can be seen that the implementation of all the other phases is necessary for the execution of Phase 1. In contrast, Phases 4 and 2 are independent of the others.

2. Option 2-the model is incorrect:
• Information about the failed transformation, i.e., whether the given reaction rule could not be applied to the state at the previous point in time or to the currently constructed state (given the mappings of unique identifiers to vertices).
Formal definitions:
• $X \subseteq \mathbb{N}$-a set of unique identifiers (UIs) of task elements. It is used to track the environment and objects involved between system transformations. The idea behind this set is to assign to each task element a unique identifier, which makes it possible to check whether the task elements marked as taking part in a reaction rule are present in a given scenario state. The reaction rules themselves only allow checking whether alike (rather than the same) elements exist in both a reaction rule and a bigraph;
• $Corr_{Red} : R \to Red$-a function that assigns reaction rules to their corresponding redexes;
• $M_x$-a set of functions assigning unique identifiers to elements of the support of a bigraph, which is either a scenario state or a redex of a reaction rule;
• $IsUpdatePossible : Agt \times M_x \times Red \times M_x \to \{true, false\}$-a function that determines whether it is possible to apply a reaction rule to a given state, taking into account the mapping of the UIs to the state's vertices and the mapping of the UIs to the redex vertices of that rule.

The flowchart of the Phase 4 algorithm is shown in Scheme 1. The input arguments of this algorithm and its results are described in Tables 5 and 6 respectively. These tables list the following entries:
• a mapping of UIs to the vertices of the redex of r;
• $s_0 \in Agt$-the state at the previous moment in time;
• a mapping of UIs to the vertices of $s_0$;
• $n_x \in X$-the first new UI;
• the reaction rule, with the UIs mapping to its redex vertices, that was not successfully applied.
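The role of unique identifiers can be illustrated with a deliberately simplified check. The sketch below is not the IsUpdatePossible function itself: it ignores the place and link structure and only verifies that every task element referenced by the redex is present in the state and has the same control. All names and the dictionary-based representation are assumptions made for this example.

```python
def is_update_possible(state_ctrl, state_uis, redex_ctrl, redex_uis):
    """Simplified presence check based on unique identifiers (UIs).

    state_ctrl / redex_ctrl: vertex id -> control type
    state_uis / redex_uis:   UI -> vertex id
    """
    for ui, redex_vertex in redex_uis.items():
        state_vertex = state_uis.get(ui)
        if state_vertex is None:
            return False  # the required task element is absent from the state
        if state_ctrl.get(state_vertex) != redex_ctrl[redex_vertex]:
            return False  # same UI, but a different kind of element
    return True


# Hypothetical state: area (UI 0) and two agents (UIs 1 and 2).
state_ctrl = {"v0": "A", "v1": "U", "v2": "U"}
state_uis = {0: "v0", 1: "v1", 2: "v2"}

# Redex that requires the area with UI 0 and specifically the agent with UI 2.
redex_ctrl = {"r0": "A", "r1": "U"}
ok = is_update_possible(state_ctrl, state_uis, redex_ctrl, {0: "r0", 2: "r1"})
missing = is_update_possible(state_ctrl, state_uis, redex_ctrl, {0: "r0", 7: "r1"})
```

Plain pattern matching would accept either agent, since both have control `U`; the UI mapping is what pins the rule to one particular agent.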
Formal definitions:
• $A \subset 2^{\mathbb{N}}$-a collection of sets of mission object identifiers. The same identifiers are used in SAT configurations;
• $W_M$-an extended walk, each element of which consists of:
  1. A positional number;
  2. A transition function;
  3. A map of UIs to redex vertices. The redex is associated with the reaction rule corresponding to the above transition function;
  4. A map of UIs to vertices of the output state of the extended walk element;
  5. The first new UI assigned to a new task element created by applying the reaction rule (useful only if the reaction rule corresponding to the transition function creates new environment elements);
  6. A set of object identifiers involved in the walk element along with the duration of that transformation. In other words, it is information about which objects are involved in the transformation represented by the walk element and how long it will take.
• $<_{W_M}$-a linear order relation on the elements of the extended walk. We will assume the following rule for ordering the elements of a walk:
• $Objects_F : I \times \mathbb{N} \to A$-a function that determines for which objects activities are scheduled later than the moment of time for which the scenario state is constructed. It takes a SAT configuration and the moment of time for which the state is generated;
• $Corr_R : Tra \to R$-a function assigning reaction rules to transitions from the TTS.

The flowchart of Phase 3 of the algorithm is shown in Scheme 2. The input arguments for the algorithm are described in Table 7. The auxiliary variables, some of which are also outcomes of Phase 3, are described in Table 8. The outcome of Phase 3 is described in Table 9.

Table 8. Auxiliary variables of the Phase 3 algorithm.

| Variable | Description |
|---|---|
| | Mapping of UIs to vertices of $s_c$. |
| $i_c \in I$ | SAT configuration of the currently constructed state. The initial value is $i_0$. |
| $A_o \in A$ | A set of object identifiers skipped in the constructed state. The initial value is the empty set. |
| $W_c \subseteq W$ | A collection of usable walk elements. The initial value is W. |
| $W_o \subseteq W$ | A collection of unused walk elements. The initial value is the empty set. |

Table 9. Output data of Phase 3 algorithm.

| Variable | Description |
|---|---|
| $W_o \subset W$ | Unused walk elements that will be used to construct subsequent scenario states. |
| | Mapping of UIs to vertices of $s_c$. |
| $i_c \in I$ | SAT configuration at time d. |
Noteworthy are the conditions checked in the subsequent steps of Phase 3 of the algorithm. Comments for each of them are given below.

1. The first condition checked is whether we have reached the end of a walk. If so, then surely the state currently constructed is the state for the given moment of time.
2. Do we omit actions of all mission objects? If so, the state constructed so far is the state for the given moment of time.
3. Do any objects involved in the current action belong to the set of skipped objects? If so, we omit this walk element.
4. Will all objects involved in the current action have finished before the moment d? If not, we disregard that activity in the currently constructed state and add those objects to the set of skipped objects.
5. If Phase 4 is not completed correctly, it means that the model is incorrect.

Scheme 2. Flowchart of the Phase 3 algorithm. The goal of this phase is to construct the state of a scenario at a given point in time. This phase runs in a loop until there are no available walk elements or until an execution of Phase 4 ends with an error. It takes subsequent elements of the input walk and updates both the current SAT configuration and a scenario state. If the mission objects will not have finished the activity represented by the currently processed walk element before or at the specified moment of time, then the SAT configuration and state updates are not performed. The same thing happens if an activity involves objects that have already been skipped because they participate in other activities ending in the future.
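The loop of Phase 3 and its five conditions can be sketched as follows. This is a simplified illustration under several assumptions: a walk element is reduced to a positional number, a set of object identifiers, a duration, and an `update` callable that stands in for Phase 4 (it may raise an exception when the model is incorrect); the scenario state is an arbitrary Python value instead of a bigraph; the SAT configuration is a dictionary from object identifiers to moments of time. The names `WalkElement` and `state_at` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet


@dataclass(frozen=True)
class WalkElement:
    position: int
    objects: FrozenSet[int]
    duration: int
    update: Callable  # stand-in for Phase 4: state -> new state, may raise on failure


def state_at(d, walk, state, config):
    """Construct the scenario state for the moment of time d."""
    config = dict(config)
    skipped, unused = set(), []
    pending = sorted(walk, key=lambda elem: elem.position)
    for index, elem in enumerate(pending):      # condition 1: the loop ends with the walk
        if skipped >= set(config):              # condition 2: all objects are skipped
            unused.extend(pending[index:])
            break
        if elem.objects & skipped:              # condition 3: a participant was skipped
            unused.append(elem)
            continue
        finish = max(config[obj] for obj in elem.objects) + elem.duration
        if finish > d:                          # condition 4: not finished by time d
            skipped |= elem.objects
            unused.append(elem)
            continue
        state = elem.update(state)              # condition 5: an error here means an incorrect model
        for obj in elem.objects:
            config[obj] = finish
    return state, config, unused


# Hypothetical walk: agent 1 does "a" (1 unit), agent 2 does "b" (3 units), agent 1 does "c" (1 unit).
a = WalkElement(0, frozenset({1}), 1, lambda s: s + ("a",))
b = WalkElement(1, frozenset({2}), 3, lambda s: s + ("b",))
c = WalkElement(2, frozenset({1}), 1, lambda s: s + ("c",))
state2, config2, unused2 = state_at(2, [a, b, c], (), {1: 0, 2: 0})
```

For d = 2 the activity "b" is still in progress, so agent 2 is skipped and "b" is returned as unused, while both activities of agent 1 are applied. The unused elements can be passed to the next invocation to construct the state for a later moment.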

Phase 2-Extending a Previously Constructed Walk
Phase 2 of the algorithm is responsible for extending a walk to the form acceptable by Phase 3.
Input:
• A walk resulting from the algorithm presented in Section 2.4;
• Number of objects.
Output:
• Extended walk.
Formal definitions:
• $W \subset \mathbb{N} \times T$-a walk. The first element denotes the positional number of the transition function that is the second element of the tuple;
• $<_W$-a linear order relation on the elements of the set W. As in the case of the set $W_M$, we define the order relation by the following rule:
• $First : 2^W \to W \times 2^W$-a function that returns the "smallest" walk element and a truncated walk.

The flowchart of the Phase 2 algorithm is shown in Scheme 3. Input arguments are described in Table 10; auxiliary variables and the result of this phase are discussed in Table 11.

Variable | Description
n_x ∈ X | The value of the first new UI. The initial value is the number of vertices of the input state of the first walk element.
m_full ∈ M_x | The current mapping of UIs to the vertices of the last processed output state. The initial value is a function that assigns consecutive natural numbers to the vertices of the input state of the first element of the walk.
W_r ⊆ W_M | Elements of the extended walk. The initial value is the empty set. This is the result of this phase.
W_c ⊆ W | A subset of walk elements that have not been processed yet. The initial value is W.
i_c ∈ I | Current SAT configuration. The initial value is ((1, 0), ..., (n_o, 0)).
Scheme 3. Flowchart of the Phase 2 algorithm. The goal of Phase 2 is to expand each element of a provided walk to the form accepted by Phase 3. Each element of the walk is coupled with the duration of its corresponding activity, the identifiers of the objects (not the unique identifiers of task elements) that participate in the activity, and two bijections. The first function maps unique identifiers to vertices of the redex of the reaction rule associated with the currently processed walk element. With this function, we know exactly who is participating in the activity. The second function maps unique identifiers to the output state of a processed TTS transition (derived from the walk element). With this function, we know exactly which task element corresponds to which vertex after applying the reaction rule. The second function is used in the next iteration of Phase 2. The UIs mapping of the element that could not be transformed and the redex of the above reaction rule.

Phase 1-Constructing All Scenario States and Checking the Correctness of a
Phase 1 input parameters are described in Table 12. The auxiliary variables along with the outcome are discussed in Table 13. The flowchart of the Phase 1 algorithm is shown in Scheme 4.
 | Mapping of UIs to vertices of the state s. The initial value is a bijection of consecutive natural numbers onto the vertices of s.
i_s ∈ I | SAT configuration for the scenario state at the time d − 1. The initial value is ((1, 0), ..., (n_o, 0)).
 | A collection of states at successive moments in time with their corresponding UIs mappings and SAT configurations. The initial value is the empty set. This is the result of this phase.

Results
This section provides example use cases of the algorithm discussed in the previous section. The first two examples show in detail how the algorithm detects errors in a model and how it constructs successive scenario states. The next examples show how to check the fulfillment of non-functional requirements for systems designed with our methodology. Finally, the memory complexity of the convolution operation performed during the construction of walks in a state space is discussed, and a few ways to address this issue are proposed.

Introduction
The first example will demonstrate how the algorithm can detect that a system is incorrectly designed.
A task (as defined in Section 2.1) for this example consists of six elements, two actions that can be performed, and one goal. The task elements comprise three areas, two robots, and an object to be carried between the areas. The goal of the task is for the robots (denoted by vertices with the control B) to move the object (denoted by a vertex with the control O) from the area AT1 to the area AT3. The initial state of this system is shown in Figure 4. We will use two reaction rules to generate a tracking bigraphical reactive system: mov1 and mov2, depicted in Figure 5a,b, respectively. The elements of a tracking transition system for this example are shown in Table 14. If we categorize the task elements as presented in Table 15, then we can transform the TTS from Table 14 into the state space shown in Figure 6. However, this will not be a valid state space because no time is taken into account for the object being moved (i.e., it is not treated as a passive or active object as defined in Section 2.1).

Using the Algorithm for Model Verification
Walk S_0
Assuming that both actions associated with the reaction rules take 1 unit of time to complete, in Phase 2 both elements of the set W will be transformed to the form: (1,
The method of constructing the m_r and m_full functions that result from the Trans function in Phase 2 is shown below.
The rule for constructing the m_r function: where par^{-1} is the inverse of the function par, which is an element of t_tra. In this case, the functions f_1, f_2, ..., f_8 correspond to the successive rows in Table 14.
The rule for constructing the m_full function: res^{-1} is the inverse of the function res, which is an element of t_tra.
Table 16 lists the successive steps of the algorithm that lead to the detection of an error in the model. The reason why this model is incorrect is not that the redex of the rule mov2 is not in the 0 state but that the moved object is categorized as an element of the environment, so the passage of time is not taken into account for it. As a result, the reaction rules appear to be independent of each other when in fact the execution of the rule mov2 depends on the execution of the rule mov1. To fix the model, the relocated object needs to be categorized as a passive object, and one needs to add a reaction rule allowing a robot that is in the AT3 area to wait until the object being moved is in the AT2 area.
Table 15. Categorization of task elements for the first example. Note that this produces an incorrect model because the moved object is considered an environment element.

Category of Task Elements | Elements Belonging to the Category
Environment | {AT1, AT2, AT3, O}
Passive objects | ∅
Active objects (agents) | {B}
Figure 6. Incorrect state space for the task from the first example.
Table 16. Subsequent steps of the algorithm in the model validation example.

Introduction
The second example demonstrates the problem of visualizing a scenario and how our algorithm can help in solving it. A task for this example is composed of three areas and two robots of the same type. The initial state of the system is presented in Figure 7. The tracking bigraphical reactive system for this example consists of two reaction rules, r1 and r2, shown in Figure 8a,b, respectively. The goal of the task is to move the two robots from the area AT1 to the area AT3. The tracking transition system generated from this TBRS is defined in Table 17 and can be transformed into a state space as in Figure 9. Now, suppose that a walk chosen for the behavior policy is of the form:
The above walk never "passes" through a state where both robots are in the AT2 area (that is, through the state S_3). Such a state must nevertheless occur during the mission, for the following reasons. The walk consists of four arcs, so it represents four activities, each of which can correspond only to a reaction rule from Figure 8. The course of the mission for each robot must therefore take the form of moving from the AT1 area to the AT2 area and then from the AT2 area to the AT3 area. Since the activities represented by the reaction rules are not cooperative (each of the reaction rules involves only one agent), the movements will be performed in parallel. We also know that the time required to perform both activities will be the same for both agents (because the agents are of the same type and perform the same type of activity), so the successive movements will end at the same moment. Because of all that, during the mission there must occur a situation where both robots are in the AT2 area at the same time. Therefore, the algorithm for constructing subsequent scenario states must be able to construct states that are not "on" a provided walk.

Using the Algorithm to Construct Scenario States
The walk S_0 → S_5 can be presented as:
Assuming that the execution of each reaction rule takes one unit of time, in Phase 2 the consecutive elements of the set W will be transformed to the following form:
The steps of the algorithm that construct the successive scenario states are presented in Table 18.
Table 18. Successive steps of the algorithm in the example of visualizing a scenario.

The algorithm ends without an error (End-ok). The result of the algorithm contains a state where both robots are in the AT2 area despite the fact that the walk has not passed through such a state. A Gantt diagram for this scenario is shown in Figure 10. Both robots perform the actions r1 and r2 in parallel.
Figure 10. A Gantt diagram for the scenario from the second example. Activities marked as t in the row preceded by Ox denote the involvement of the element x (x is the unique identifier of a task element given at its first appearance or at the beginning of a scenario) in the activity t. Only elements that are active objects are included in the diagram.
The functions paired with each state make it possible to "track" task elements between states. For example, the function m_s = {(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)} for the state at time 0 indicates that the object tagged with the unique identifier 2 (the argument of the m_s function) is represented by the vertex with identifier 2 (the value of the m_s function for the argument 2). The support of a bigraph itself does not track its elements between transitions, as can be seen by comparing the state of the system at time 0 and time 1. For example, knowing that there is one area of each type, we have no doubt that a vertex with the control AT1 represents the same object in both states even though the support element assigned to that vertex differs between the states. However, we do not have such certainty for vertices with controls of the type B. Unique identifiers point to unique objects between states, even if those objects have changed the controls representing them.
The following example, based on the elements of the set S_r from Table 18, shows how to use a unique identifier mapping. For the state at time 1, the UI with the value of 3 points to the vertex with identifier 1. This means that it is the same task element that is represented by the vertex with identifier 3 in the state at time 0 and by the vertex with support element 0 in the state at time 2.
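The lookup described above can be sketched in Python. Only the identity mapping at time 0 and the entries for the UI 3 at times 1 and 2 are taken from the text; the remaining entries of the mappings at times 1 and 2 are hypothetical and were filled in only so that each mapping is a bijection. The function names are likewise illustrative.

```python
from typing import Dict, List

# One mapping per time step: unique identifier (UI) -> vertex identifier.
ui_mappings: List[Dict[int, int]] = [
    {0: 0, 1: 1, 2: 2, 3: 3, 4: 4},  # time 0 (from the text)
    {0: 2, 1: 3, 2: 4, 3: 1, 4: 0},  # time 1 (hypothetical except 3 -> 1)
    {0: 1, 1: 2, 2: 3, 3: 0, 4: 4},  # time 2 (hypothetical except 3 -> 0)
]


def track(ui: int, mappings: List[Dict[int, int]]) -> List[int]:
    """Return the vertex that represents the given UI at each time step."""
    return [mapping[ui] for mapping in mappings]


def ui_of_vertex(vertex: int, mapping: Dict[int, int]) -> int:
    """Invert a bijective UI mapping: which UI does this vertex stand for?"""
    inverse = {v: u for u, v in mapping.items()}
    return inverse[vertex]


vertices_of_ui_3 = track(3, ui_mappings)
owner_of_vertex_1_at_time_1 = ui_of_vertex(1, ui_mappings[1])
```

Because each mapping is a bijection, the same task element can be followed forward from a UI to its vertices or backward from a vertex to its UI.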

Example of Verifying the Fulfillment of Non-Functional Requirements
The last example is intended to demonstrate how non-functional requirements can be defined for systems designed using our methodology and determine whether these requirements have been satisfied.
For this example, we define a task of relocating items in a warehouse. The goal of this task is for two robots to deploy items of different types from the warehouse to unloading areas. The initial state of the task is depicted in Figure 11. The interpretation of each control is shown in Table 19. Six reaction rules are defined for this system; all of them are listed and described in Table 20. For this example, the graphical representation of the reaction rules is omitted because it is not relevant.
Figure 11. The initial state of a system in the example of checking whether non-functional requirements are met.
Table 19. Interpretation of controls in the example of checking whether non-functional requirements are satisfied.

Control | Real World Object
 | Warehouse area - robots can move between them.
B | Beacon - indicates the warehouse area where robots should return after relocating objects.
M | Warehouse - it stores objects to be moved.
OT1 | Object of type 1
OT2 | Object of type 2
DT1 | Type 1 unloading area - the location where objects of type 1 are to be relocated.
DT2 | Type 2 unloading area - the location where objects of type 2 are to be relocated.

 | A robot retrieves a type 2 object from the warehouse. | 2
set1 | A robot deposits a type 1 object into an unloading area. | 2
set2 | A robot deposits a type 2 object into an unloading area. | 2

The state space for the system consists of 666 states (vertices) and 5325 transitions (arcs). Due to the size of this example, the graphical representation of the state space and the elements of the tracking transition system are not presented. It is worth discussing here how the size of a state space grows as the number of system elements increases. If one were to expand the current system to three robots, two type 1 objects, and three type 2 objects, the number of states would increase to 5765 and the number of transitions to 70,701. Such a significant increase in the size of a system suggests that it is reasonable to consider ways of constructing only a limited part of a state space that remains useful in later stages of the development of behavior policies.
We now move on to behavior policies for the agents in the task above. The first walks found that solve the task are 15 steps long. However, these solutions use only one robot, as can be observed in the action schedule presented in Figure 12. A mission performed using a behavior policy based on such a walk takes 21 units of time.
Figure 12 (action schedule), row O1: mov, mov, get1, mov, set1, mov, get2, mov, set2, mov, get2, mov, set2, mov, mov.
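The relation between the number of steps and the mission time can be checked with a short Python sketch. The durations of the deposit rules and of the rule that retrieves a type 2 object (taken here to be get2) are 2 units, as listed for the reaction rules; the durations of mov (1 unit) and get1 (2 units) are assumptions chosen to be consistent with the reported total, and the names in the code are illustrative.

```python
from typing import Dict, List

# Durations in units of time. The values for mov and get1 are assumptions.
DURATION: Dict[str, int] = {
    "mov": 1,
    "get1": 2,
    "get2": 2,
    "set1": 2,
    "set2": 2,
}

# Actions of the single active robot (row O1 of the action schedule).
SCHEDULE_O1: List[str] = [
    "mov", "mov", "get1", "mov", "set1",
    "mov", "get2", "mov", "set2", "mov",
    "get2", "mov", "set2", "mov", "mov",
]


def mission_length(schedule: List[str], duration: Dict[str, int]) -> int:
    """Total time of a schedule executed sequentially by one robot."""
    return sum(duration[action] for action in schedule)


steps = len(SCHEDULE_O1)
total_time = mission_length(SCHEDULE_O1, DURATION)
```

Under these assumptions the 15 steps consist of nine moves and six two-unit retrieve or deposit actions, which gives the 21 units of time stated above.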

Non-Functional Requirement-Length of a Mission
Now let us assume that one of the non-functional requirements imposed on the task is to limit the length of a mission to a maximum of 20 units of time. No walk of length 15 satisfies this requirement. Knowing that the current solutions use only one robot, we can try to improve them by extending the walks to 18 steps. This way the second robot can move one of the objects to an unloading area. A schedule of actions constructed with a walk of 18 steps is presented in Figure 13. It is important to note that simply lengthening a walk does not guarantee an improved result. For example, if the walk underlying the schedule in Figure 12 is lengthened by three transition functions, all corresponding to the reaction rule stay, the quality of the solution does not improve.
Checking whether non-functional requirements are fulfilled should be done in Phase 1 after Phase 3 has been successfully completed. This step is not shown in Scheme 4, but this is how it was implemented in [39].
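One possible way to organize such a check is sketched below in Python. It is not the implementation from [39]: the representation of scenario states as a list indexed by time, the `Requirement` type, and the function names are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Sequence

State = dict  # stand-in for a scenario state (assumption)
Requirement = Callable[[Sequence[State]], bool]


def max_mission_length(limit: int) -> Requirement:
    """Requirement: the mission must take at most `limit` units of time.

    Assumes states[t] is the scenario state at time t, so the index of the
    last state equals the length of the mission.
    """
    def check(states: Sequence[State]) -> bool:
        return len(states) - 1 <= limit
    return check


def violated(states: Sequence[State],
             requirements: Dict[str, Requirement]) -> List[str]:
    """Return the names of the requirements that the scenario fails."""
    return [name for name, req in requirements.items() if not req(states)]


requirements = {"mission length <= 20": max_mission_length(20)}

# Placeholder scenarios: only the number of time steps matters here.
scenario_21_units = [{} for _ in range(22)]  # times 0..21
scenario_20_units = [{} for _ in range(21)]  # times 0..20

failed_long = violated(scenario_21_units, requirements)
failed_short = violated(scenario_20_units, requirements)
```

Keeping each requirement as a separate predicate over the constructed states lets further requirements be added without changing the state construction itself.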

Non-Functional Requirement-Collision Avoidance
Another example of a non-functional requirement will be related to safety of mission execution. This time we impose a requirement that there should be no collisions between robots that are in the process of moving objects.
One of the advantages of using bigraphs is that they allow one to define patterns to be found in other bigraphs. These patterns are of the "minimal satisfying phenomenon" type. One cannot define an "all but" type of pattern in bigraph notation. In other words, one can define a pattern like "at least three people in a room" but cannot define a (single) pattern that detects "fewer than three people in a room".
Let us assume that a collision-free mission will be guaranteed if the robots moving the objects are not in the same area. Such a requirement can be defined as "if two robots, at least one of which is moving an object, are in the same area then the scenario is unacceptable". Bigraph patterns able to detect such a situation are shown in Figure 14.
As with the previous non-functional requirement, this requirement can be verified in Phase 1 after a successful completion of Phase 3.
Figure 14. Bigraph patterns to detect whether a collision between robots may occur during a scenario. The two patterns differ only in the type of the relocated object. (a) The first pattern. (b) The second pattern.
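The condition that these patterns detect can be illustrated with a simplified Python check. This sketch does not perform bigraph matching; it assumes a flattened view of a scenario state in which each robot is described by its current area and the object it carries, and the names `Robot` and `has_collision_risk` are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class Robot:
    name: str
    area: str
    carrying: Optional[str] = None  # e.g. "OT1", "OT2", or None


def has_collision_risk(robots: Iterable[Robot]) -> bool:
    """True if some area holds two robots, at least one moving an object."""
    by_area: Dict[str, List[Robot]] = defaultdict(list)
    for robot in robots:
        by_area[robot.area].append(robot)
    return any(
        len(group) >= 2 and any(r.carrying is not None for r in group)
        for group in by_area.values()
    )


risky = has_collision_risk([Robot("R1", "A1", "OT1"), Robot("R2", "A1")])
apart = has_collision_risk([Robot("R1", "A1", "OT1"), Robot("R2", "A2")])
idle = has_collision_risk([Robot("R1", "A1"), Robot("R2", "A1")])
```

Like the bigraph patterns, the check describes a minimal unacceptable situation: a scenario is rejected as soon as one state satisfies it.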

Memory Complexity
As we have already mentioned, the size of a system grows much faster than the number of task elements. The same is true for the memory complexity of the matrix multiplication operations described in Section 2.4. We tested how limiting the number of results of the convolution operation affects the memory usage of the tool [40]. All measurements were done using a multi-threaded F# implementation on a PC running 64-bit Ubuntu 20.04, with the previous example regarding non-functional requirements used for testing. We carried out three tests: limiting the number of results to 500, limiting it to 10,000, and leaving it unlimited. In the first case, the peak memory usage was about 700 MB before walks with a length of 15 arcs were found. The second case resulted in memory consumption of around 15 GB before similar walks were found. The last case did not succeed on a machine with 64 GB of RAM.
To deal with the memory complexity, we propose three methods to reduce the number of results:
• First N: the result of the convolution operation performed during a matrix multiplication is limited to the first N results. This way of searching for behavior policies is suitable when the first results found satisfy non-functional requirements;
• Best N: the result of the convolution operation is constrained to the N best results evaluated using an evaluation function (discussed below). This method of searching for walks is useful when a desired walk should have a certain length;
• All: the result of a convolution operation is not constrained in any way. Useful only for small systems to verify model correctness.
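The difference between the first N and best N methods can be sketched in Python as follows. The sketch operates on a plain sequence of candidate results rather than on the convolution operation itself, and the scoring function is only a stand-in that prefers evenly loaded objects; it is not the evaluation function proposed in this paper. All names are illustrative.

```python
import heapq
from itertools import islice
from typing import Callable, Iterable, List, Tuple, TypeVar

T = TypeVar("T")


def first_n(results: Iterable[T], n: int) -> List[T]:
    """Keep the first n results in the order in which they were produced."""
    return list(islice(results, n))


def best_n(results: Iterable[T], n: int,
           score: Callable[[T], float]) -> List[T]:
    """Keep the n results with the highest score, best first."""
    return heapq.nlargest(n, results, key=score)


def balance_score(engagement: Tuple[int, ...]) -> float:
    """Stand-in score: higher when objects are engaged more equally."""
    return -(max(engagement) - min(engagement))


# Each candidate lists how long each of two objects has been engaged.
candidates = [(4, 0), (2, 2), (3, 1), (5, 0)]

kept_first = first_n(candidates, 2)
kept_best = best_n(candidates, 2, balance_score)
```

First N needs no scoring or sorting, which is why it is the cheaper of the two, while best N trades that cost for control over which partial results survive.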
In the case of the best N method, there is a need to define an evaluation function for partial solutions. We propose a SAT configuration evaluation function based on the involvement of task objects. The evaluation function returns a higher score the more equally the objects are involved. The formula for calculating the evaluation function value can be expressed as below:
The largest engagement of any object.
Table 21 shows the values of the proposed evaluation function for a few example SAT configurations.
The prepared tool [40] for walk construction offers six strategies for finding solutions:
• All first found: returns all walks of the shortest length leading to the goal state;
• First N found: returns all walks leading to the target state. The matrix multiplication operation is constrained by the first N method;
• First N best found: returns all walks leading to the target state. The matrix multiplication operation is constrained by the best N method;
• All up to a certain length: returns all walks leading to the target state of a length no greater than a given value;
• First N up to a certain length: returns all walks leading to the target state with a length no greater than a given value. The matrix multiplication operation used to find walks in a state space is constrained by the first N method, so each set of walks of the same length contains at most N elements;
• Best N up to a certain length: returns all walks leading to the target state with a length no greater than a given value. The matrix multiplication operation used to find walks in a state space is constrained by the best N method, so each set of walks of the same length contains at most N elements.
We summarized all of the above strategies in Table 22.

Strategy | MNoR | Pros | Cons
All first found | Unlimited | Perfect for assuring correctness of the model, as this strategy gives all existing walks to the desired destination state. | Infeasible for anything but small systems due to large memory consumption.
First N found | N | The fastest of all strategies since it does not sort results and can shrink the output of the convolution operation. Perfect when the quality of a result is not important or when all results are expected to have similar quality. | Does not take the quality of returned results into account at all.
First N best found | N | With a good evaluation function this strategy can return the best results. Perfect when the model has already been validated and the developer is looking for a behavior policy of a certain quality. | Slower than first N found since results are sorted with an evaluation function.
All up to a certain length | Unlimited | Gives a glimpse of how the length of a walk impacts the way a mission is executed. Since it is an extension of all first found, it allows for thorough correctness testing. | Only for tiny systems. This is the most memory-consuming strategy because it not only returns all found results but also continues the search until the results have the specified length.
First N up to a certain length | N × L | Allows insight into how the length of a result impacts the way a mission is executed. Very fast, as it is an extension of first N found. | Does not take the quality of returned results into account at all.
Best N up to a certain length | N × L | Gives good insight into how the quality of results varies with the length of a walk. Perfect when the developer is looking for a behavior policy that he or she has no expectations about. | Slower than the first N up to a certain length strategy due to sorting of results.

Discussion
In this paper, we presented an algorithm to verify multi-agent system models based on tracking bigraphical reactive systems. Our algorithm can detect incorrectness of a model and unfulfillment of non-functional requirements. The algorithm considers a model to be incorrect if activities planned to be executed in parallel are not independent of each other. In this article, we presented two examples of utilizing the algorithm to check if a behavior policy meets non-functional requirements regarding time and safety of task execution. We also demonstrated how to generate, from a selected behavior policy, the successive states of a scenario, that is, of a task realization using that policy. Finally, we discussed the memory complexity of operations essential to the generation of behavior policies and proposed a few ways to reduce it. One of the suggested methods is to limit results to a certain number of the best ones. We gave an example of an evaluation function that allows ranking partial results (in our case, these are behavior policies that when executed do not meet functional requirements). The evaluation function is applicable to tasks of any kind and size. This work complements our previous publication, which focused solely on designing multi-agent systems with tracking bigraphs. The methodology enables the design of a broad range of systems, from warehouse robots to drone swarms performing a task without human intervention. One can also consider designing software systems where programs act as agents and operations performed by these programs represent transition functions. The functional programming paradigm intuitively fits this kind of design.
The main drawback of our methodology is the lack of adaptability of behavior policies. This means there can be no deviation from scheduled actions when executing a behavior policy. It also means that agents in a modeled system have to be fully controllable in the real world. The biggest drawback of the algorithm presented in this article is that it verifies the correctness of a model looking for errors in a single behavior policy. Thus, the more behavior policies that are checked, the more confident we are that the model is correct.
As for directions of further development, the primary goal should be to improve the generation speed of tracking reactive systems, as it is currently the main limitation of the methodology. One way to achieve this is to develop a method for the partial construction of a tracking bigraphical reactive system that consists only of the bigraphs necessary to produce a good-quality walk in the state space. If the method of reducing the number of states is automatic, i.e., it does not require the designer to specify bigraphical patterns, it will significantly speed up the development of behavior policies. Right now our method can only be applied to relatively small systems because the explosion of states makes it impossible to efficiently search for walks in the state space of a modeled system.
Author Contributions: Conceptualization, P.C. and Z.Z.; methodology, P.C. and Z.Z.; software, P.C.; validation, P.C.; formal analysis, P.C.; investigation, P.C.; resources, P.C.; data curation, P.C.; writing-original draft preparation, P.C.; writing-review and editing, Z.Z.; visualization, P.C.; supervision, Z.Z.; project administration, Z.Z. Both authors have read and agreed to the published version of the manuscript.