1. Introduction
The mutual exclusion problem [1,2,3] arises each time concurrent/parallel processes (or threads) have to access a shared resource in read/write mode. For predictability of the resource's evolution, only one process at a time must be allowed to use it. The code segment executed by a process when using the resource constitutes its critical section (CS). Several algorithms (protocols) have been proposed in the literature for solving the mutual exclusion problem. In this paper, the interest is in algorithms that are based on software mechanisms only, that is, without the support of special hardware instructions like test-and-set. Each particular protocol (see Algorithm 1) consists of a (hopefully minimal) number of shared communication variables (registers), suitably initialized, and a prologue section where the process signals its interest in accessing the resource and starts competing with its peers. The process remains (waits) in the prologue until the conditions on the shared variables permit it to enter its critical section. Waiting is normally achieved by busy–waiting loops, with the process actively checking selected variables until a wake-up condition is sensed that enables it to exit from the busy–waiting. After the critical section, an epilogue section is executed, which resets registers, thus allowing other competing processes to possibly resume movement toward their critical section. In the non-critical section (NCS), the process executes actions independent of the resource. For generality, in the NCS the process can also terminate its execution.
Algorithm 1. Abstract structure of a process using a mutual exclusion protocol.
global shared communication variables
local variables
repeat
    NCS       //Non Critical Section
    Prologue  //Enter section of the protocol
    CS        //Critical Section
    Epilogue  //Exit section of the protocol
forever
The first “pure” software solution to the mutual exclusion problem was due to Dekker [4,5] and was concerned with the handling of two processes. Dijkstra proposed a solution [6] for N processes, although with a possibly unbounded waiting time for a competing process. Knuth [7], de Bruijn [8], and Eisenberg and McGuire [9] developed, in turn, improved versions of Dijkstra’s algorithm that ensure a bounded waiting time for a competing process, possibly linear in N, e.g., [9]. A simpler and more elegant solution than Dekker’s algorithm for two processes and a general solution for N processes were proposed by Peterson in [10]. All such algorithms were mainly studied using atomic registers and through informal mathematical reasoning. Atomic registers have indivisible read/write operations, that is, such operations cannot overlap on the same register. In addition, informal reasoning has great difficulty in assessing the general properties of a mutual exclusion algorithm. This is due to the intrinsic non-determinism, or interleaving, that characterizes the execution order of the actions of concurrent/parallel processes, which is almost always impossible to predict. This can explain, for example, the different conjectures about the waiting time of a competing process in Peterson’s algorithm for N processes [11].
As observed by Lamport in [1], studying the correctness of a mutual exclusion algorithm by assuming atomic read/write operations on registers amounts to achieving mutual exclusion by relying on a lower-level mechanism of mutual exclusion. On the other hand, the widespread diffusion of multi-port memory devices [12] (e.g., in cell phones and similar low-cost hardware devices) mandates the adoption of mutual exclusion solutions that can tolerate multiple simultaneous read/write operations on the same register. Therefore, the more general context in which to check the properties of mutual exclusion algorithms is that of non-atomic registers and their associated consistency rules.
This paper significantly extends the authors’ previous work [11,13,14,15] aimed at the development of a formal method for modeling and verification of mutual exclusion algorithms based on timed automata [16] and model checking [17,18], using the Uppaal toolbox [19,20,21] for the practical experiments. Previous work mainly addressed the correctness assessment of mutual exclusion solutions using atomic registers and a preliminary technique for dealing with non-atomic registers. The original contribution of the current paper is to propose more general techniques to ensure the consistency of non-atomic registers [22]. The new techniques are rendered in the Uppaal modeling method, and their practical usefulness is demonstrated by modeling examples. The new techniques specifically address non-atomic registers where multiple writers and multiple readers (MWMR) can overlap their execution on the same register, or a single writer and multiple readers (SWMR) can access a register simultaneously. The MWMR case can cause the value of the register to be scrambled [22,23], that is, a non-deterministic value belonging to the register type domain is written to it, which can then be acquired by readers. The SWMR case is the source of the flickering phenomenon [22,23], where each distinct overlapping reader can obtain a non-deterministic valuation of the register value.
Different mutual exclusion solutions are considered for modeling and correctness assessment. As an example, the paper shows that Szymanski’s algorithm proposed in [24] is correct under atomic registers but becomes incorrect when using non-atomic registers. In addition, the improved solution reported in [25], designed to ensure a linear waiting time for a competing process and which admits a reduced shared data space, effectively guarantees a linear overtaking factor (the number of times competing processes can bypass a given waiting process and precede it in entering their critical section), but, for
processes, it does not fulfill mutual exclusion properties even with atomic registers. From the algorithm in [
25], a version for two processes (S2) is derived in this paper, and its correctness is demonstrated using either atomic or non-atomic registers. Both the Peterson algorithm for two processes (P2) [10] and the Szymanski algorithm for two processes are then exploited, separately, as the arbitration unit in a standard and efficient tournament tree (TT)-based solution [11,14,26,27], thus addressing N processes. As an original result of this paper, it is shown that whereas TT-P2 is incorrect when both scrambling and flickering of non-atomic registers are admitted, the TT-S2 solution, which can only be affected by flickering, represents a fully correct mutual exclusion algorithm also with non-atomic registers.
The paper compares the model checking efficiency of three proposed flickering techniques for non-atomic registers with the so-called technique, which is a novel proposal of this paper, that emerges as the best one for application generality, flexibility, and execution performance.
The formal modeling approach developed in this paper is similar to the process algebra method based on the mCRL2 model checker [
28], described in [
22]. However, this paper argues that the presented approach is more intuitive to use because it rests on processes modeled graphically by timed automata, which are more amenable to understanding and preliminary debugging activities. Visual modeling and exhaustive model checking, that is, systematically investigating all the possible (hopefully finite) execution paths/states in a model, represent the strength of the developed formal method.
The paper is structured as follows. Section 2 describes the consistency rules of non-atomic registers. Section 3 presents the basic formal method, centered on Uppaal, for modeling and verification of mutual exclusion algorithms. A few timed computational tree logic (TCTL) queries are suggested for checking the properties of a mutual exclusion solution. Section 4 proposes some techniques for modeling scrambling and flickering, respectively, in MWMR/SWMR non-atomic registers. The techniques are rendered in Uppaal to be easily integrated within the basic formal method described in Section 3. Section 5 illustrates the application of the MWMR/SWMR consistency techniques to different mutual exclusion algorithms, also in the context of the general and efficient tournament binary tree organization. The section compares the model checking efficiency of the three flickering techniques proposed in Section 4. Section 6 concludes the paper with an indication of ongoing and future work.
2. Register Consistency in Non-Atomic Memory
Shared communication variables (registers) used by a mutual exclusion protocol can (hopefully) be only output variables [29] (also called exterior variables in [26]). Each output variable is owned by a distinct process, which is the only one that can write a value to it. Output variables, though, can be read by all the other processes. In contrast, a non-output variable can be read/written by any process. This is the most complex case that can occur.
Register consistency conditions arise when multiple operations can be executed simultaneously on the same register. The so-called Single Writer Multiple Readers (SWMR) case is the first scenario to be considered [22,30]. A register operated under SWMR is safe if a read that does not overlap any write returns the most recently written value, whereas a read that happens concurrently with a write operation may return a non-deterministic value belonging to the domain (type) of the register (flickering). The register is regular, under SWMR, if it is safe and, in addition, a read that occurs simultaneously with a write operation returns either the value of the overlapping write or the most recently written value. Finally, a register is atomic under SWMR when the read and write operations never occur simultaneously but are executed according to a definite order.
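The SWMR rules above can be summarized as the set of values that a single read is allowed to return. The following Python sketch is only an illustration of those rules and not the Uppaal rendering developed in this paper; the function name `swmr_read_values` and its parameters are assumptions introduced here.

```python
def swmr_read_values(kind, domain, last_written, overlapping_write=None):
    """Return the set of values one read may yield on an SWMR register.

    kind              -- "safe" or "regular" (hypothetical labels)
    domain            -- iterable with every value of the register type
    last_written      -- value of the most recent completed write
    overlapping_write -- value being written during the read, or None
    """
    if kind not in ("safe", "regular"):
        raise ValueError(f"unknown register kind: {kind}")
    if overlapping_write is None:
        # No concurrent write: both kinds return the last written value.
        return {last_written}
    if kind == "safe":
        # Flickering: any value of the register type may be observed.
        return set(domain)
    # Regular: either the old value or the value being written.
    return {last_written, overlapping_write}
```

For a register of type int[0, 4] holding 1 while 3 is being written, a safe read may return any of the five values, whereas a regular read is restricted to 1 or 3.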
The most complex scenario is the Multiple Writers Multiple Readers (MWMR) one. As in [22,23], the Raynal condition [3] is assumed. Under MWMR, a register is safe if multiple concurrent writes on the register cause scrambling of its value, that is, a non-deterministic value belonging to the register type is actually written. Any read operation that overlaps one or multiple writes returns a non-deterministic value belonging to the type of the register. The register is regular if a read that overlaps write operations returns either the last value written to the register or the value of any overlapping write. Finally, an MWMR register is atomic if read/write operations never overlap and a read operation always returns the value of the last write executed before the moment the read was issued.
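The MWMR rules can be illustrated in the same spirit. The Python sketch below is a hedged summary of the definitions just given, not the modeling technique of this paper; the names `mwmr_safe_write_outcomes` and `mwmr_read_values` are hypothetical.

```python
def mwmr_safe_write_outcomes(domain, concurrent_writes):
    """Values that may end up stored in an MWMR-safe register."""
    writes = list(concurrent_writes)
    if not writes:
        raise ValueError("at least one write is required")
    if len(writes) == 1:
        # A lone write stores exactly its own value.
        return {writes[0]}
    # Scrambling: overlapping writes may leave any value of the type.
    return set(domain)


def mwmr_read_values(kind, domain, last_written, overlapping_writes):
    """Values one read may return on an MWMR-safe or MWMR-regular register."""
    if kind not in ("safe", "regular"):
        raise ValueError(f"unknown register kind: {kind}")
    writes = list(overlapping_writes)
    if not writes:
        # No overlap: the last written value is returned.
        return {last_written}
    if kind == "safe":
        # Any value of the register type may be read.
        return set(domain)
    # Regular: the last written value or the value of any overlapping write.
    return {last_written} | set(writes)
```

The only difference with the SWMR case that this sketch captures is that the stored value itself becomes non-deterministic when two or more writes overlap on a safe register.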
Of course, the use of only output variables requires a mutual exclusion algorithm to be correct under atomic registers and SWMR-safe or SWMR-regular non-atomic registers. When non-output variables are admitted, the atomic algorithm’s correctness has to be checked under the more challenging MWMR-safe or MWMR-regular non-atomic registers.
A specific contribution of this paper is a rendering of the non-atomic SWMR/MWMR registers in an Uppaal model. In particular, the approach starts by transforming a mutual exclusion algorithm into an Uppaal model tailored to only use atomic registers. Model correctness is then thoroughly assessed. After that, the model is modified according to the requirements of the intervening SWMR/MWMR scenarios, and the correctness is re-checked under safe or regular non-atomic registers.
3. Uppaal Modeling with Atomic Registers
The following summarizes the formal approach adopted in this work for modeling and verification by model checking of a mutual exclusion algorithm reduced onto the timed automata language of the Uppaal toolbox [
19,
20,
21]. The goal is to clarify the basic modeling infrastructure based on atomic registers, on top of which the operation of non-atomic registers will subsequently be embedded. The reader is referred to [
11,
13] for more details about the rationale of the transformation of a mutual exclusion solution into an Uppaal model.
A concurrent system with N processes is represented by a network of N Process(i) instances (timed automata), whose unique identifiers range from 1 to N. The processes execute the same algorithm and interact with one another through the shared communication variables used by the mutual exclusion protocol.
Under atomic modeling, any single access (read/write) to a shared communication variable constitutes an atomic action. Therefore, a boolean expression involving multiple shared variables has to be evaluated component-by-component. An exception to the above rule is represented by the management of local variables of a process. Multiple local variables, supposedly held in hardware registers, can be operated simultaneously.
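Why component-by-component evaluation matters can be seen in a small scenario. The Python sketch below is an illustration only: the generator `stepwise_all_at_most_2`, the helper `demo`, and the specific sequence of writes are assumptions made here, not a trace taken from any model in this paper. It evaluates a condition of the form "every flag is at most 2" with one shared read per step and lets another process write between the two reads.

```python
def stepwise_all_at_most_2(flag):
    """Check `flag[j] <= 2` for every j, one shared read per step."""
    for j in range(len(flag)):
        ok = flag[j] <= 2   # a single atomic read of one shared register
        yield ok            # other processes may run before the next read
        if not ok:
            return


def demo():
    """Interleave two writes between the reads of a two-register condition."""
    flag = [0, 3]
    snapshots = [all(v <= 2 for v in flag)]    # whole condition, read at once
    reader = stepwise_all_at_most_2(flag)
    first = next(reader)                       # reads flag[0], which is 0
    flag[0] = 3                                # write by another process
    snapshots.append(all(v <= 2 for v in flag))
    flag[1] = 0                                # write by another process
    snapshots.append(all(v <= 2 for v in flag))
    second = next(reader)                      # reads flag[1], which is now 0
    return first and second, snapshots
```

In this scenario the stepwise evaluation succeeds although the condition is false in every instantaneous snapshot of the registers, which is exactly the kind of behavior that an atomic, one-access-at-a-time model has to expose to the verifier.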
Szymanski’s algorithm for N processes, proposed in [24] (page 624, Figure 2), is chosen for demonstration purposes (see Algorithm 2). It is based on output variables only: flag[i], with i ranging from 1 to N. The solution follows the metaphor of a waiting room with two doors: door_in and door_out. Initially, the door_in is open and the door_out is closed. Each process can be in one of 5 states (values 0 to 4 of flag[i]). State 0 means the process is not interested in the critical section. In state 1, the process wants to enter the waiting room. All the processes that want to enter the waiting room at the same time are allowed to pass through the door_in. In state 2, a process checks if other external processes want to enter. If no other processes are sensed, the process closes the door_in, passes to state 3, and the door_out is opened. From the waiting room, processes exit the door_out according to the order of their IDs. In state 4, the process exits the waiting room and enters the CS. When finishing with the CS, the process switches again to state 0. At the time the last process exits from its CS, the initial door situation is restored.
Algorithm 2. Szymanski’s mutual exclusion algorithm [24].
shared communication variables:
    int [0, 4] flag [1..N]; //all 0 initially
Process(i:1..N)
local variables: j
repeat
    NCS
    P10: flag[i] = 1;
    P11: await();
    P20: flag[i] = 3;
    P21: if(){ flag[i] = 2;
    P22:     await()}
    P30: flag[i] = 4;
    P31: await();
    CS
    E0:  await();
    E1:  flag[i] = 0;
forever
The
await() condition at the E0 line takes into account the correction suggested in [
31] (page 675).
Algorithm 2 was analyzed informally by its author in [24], and a linear overtaking factor (that is, the maximal number of times a process in the waiting room can be bypassed by other competing processes) was predicted:
, as a direct consequence of enforcing an exit order from the waiting room directly dependent on the process IDs.
Figure 1 shows a transformation of Szymanski’s algorithm of the generic
in an Uppaal-timed automaton model made of
locations,
edges, and
clocks. Time can be controlled by
clock variables, which measure relative amounts of elapsed time. A clock can be reset. After a reset, the clock grows at the same rate as all the clocks in the model. Edges are annotated by guarded commands. Guards (green-colored in Figure 1) are logical conditions that enable an edge to be possibly taken. A command can contain a synchronization on an urgent broadcast (asynchronous) channel, synch (azure-colored), that constrains the edge to be taken as soon as the guard evaluates to true. A command can also have an update part (blue-colored), which consists of an ordered list of assignments and clock resets. Locations can be normal (see NCS, AWP11, and so forth) or urgent (flagged with an internal U). The automaton (Process instance) can remain in a normal location for an arbitrary time. The dwell time in a normal location can be constrained by attaching to it an
invariant (see the CS location with the invariant
in
Figure 1). The process can remain in the location provided the invariant (a logical condition built on data or clocks) continues to be satisfied. An urgent location must be abandoned without time passage. Among the current urgent locations of a set of processes, Uppaal ensures they will be exited in a non-deterministic order. Therefore, for a proper reproduction of the concurrency of the actions of multiple processes, actions can be purposely attached to the edge commands exiting from urgent locations. This way action interleaving and partial order of a concurrent/parallel execution are naturally ensured. It is worth noting that exiting from a normal location with an enabled guard and urgent output synchronization (see, e.g., the AWP11 location in
Figure 1) is equivalent to exiting from an urgent location. Thus, non-determinism regulates the execution of all types of urgent actions.
The model in
Figure 1 is time-dependent in the following aspects. In the non-critical section (NCS), which is also the initial location, a process can remain an arbitrary time, also infinite. This way, process termination gets naturally modeled. In the critical section (CS) location, the dwell time is constrained to be exactly 1 time unit. This choice was adopted to measure the overtaking factor (
) of processes easily. Each process is associated with a clock
, which is reset just before entering the CS. A target process (
) can be established and its clock
observed for measuring the
. All the times the process
starts competing, its clock
is reset (through the
function). Then,
gets incremented as competing processes gain the CS before
. Finally, the maximum value of
is measured just before
enters its CS (see the
location in
Figure 1).
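The measurement idea can be restated outside the timed setting. Because each CS visit lasts exactly one time unit in the model, the time elapsed while the target process competes corresponds to a count of CS visits by other processes. The Python sketch below replaces the clock with an integer counter over a recorded event trace; the trace format, the event names "compete" and "enter_cs", and the function `max_overtaking` are assumptions made for illustration.

```python
def max_overtaking(trace, target):
    """Largest number of CS entries by others while `target` was competing.

    trace  -- list of (pid, event) pairs, event in {"compete", "enter_cs"}
    target -- identifier of the observed process
    """
    competing = False
    count = 0
    worst = 0
    for pid, event in trace:
        if pid == target and event == "compete":
            # The target starts competing: restart the measurement.
            competing, count = True, 0
        elif pid == target and event == "enter_cs":
            # The target gains the CS: record the overtaking it suffered.
            worst = max(worst, count)
            competing = False
        elif event == "enter_cs" and competing:
            # Another process entered its CS ahead of the waiting target.
            count += 1
    return worst


example_trace = [
    (1, "compete"), (2, "compete"), (2, "enter_cs"),
    (3, "compete"), (3, "enter_cs"), (1, "enter_cs"),
    (1, "compete"), (2, "compete"), (2, "enter_cs"), (1, "enter_cs"),
]
```

On `example_trace`, process 1 is bypassed twice in its first competition and once in its second. Unlike a supremum computed by the model checker, which ranges over all execution paths, this function inspects one given trace only.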
Busy–waiting situations are required in the P11, P22, and P31 points of the Prologue section and the E0 point of the Epilogue of the mutual exclusion protocol (see Algorithm 2). Whereas an internal spin-lock of a process running on a separate core can be practically tolerated, it should conveniently be avoided in an Uppaal model for verification efficiency (by reducing the number of distinct execution paths to be analyzed). The adopted solution consists of introducing a normal location (see AWP11, AWP22, AWP31, and AWE0 in
Figure 1) for each busy–waiting. Instead of immediate looping, the exit from the location is controlled by a guard function like tryAll() in AWP11. This function checks “optimistically” (by possibly consulting multiple shared variables simultaneously) if a potential exit from the busy–waiting location is possible. If so, the urgent and asynchronous synch signal is used to force the exit without further time passage. However, following the exit from the busy–waiting location, the effective evaluation of the (possibly complex) await condition is carried out, component by component. If, at any moment, the condition for abandoning the busy–waiting is found to be false, the busy–waiting location is immediately re-entered. Algorithm 3 collects all the try functions used in
Figure 1.
Algorithm 3. The try functions of the model in Figure 1.
bool tryAll(){
    int j;
    for(j = 1; j <= N; ++j)
        if(flag[j] > 2) return false;
    return true;
}//tryAll

bool tryAny(){
    int j;
    for(j = 1; j <= N; ++j)
        if(flag[j] == 4) return true;
    return false;
}//tryAny

bool tryInf(){
    int j;
    for(j = 1; j < i; ++j)
        if(flag[j] > 1) return false;
    return true;
}//tryInf

bool trySup(){
    int j;
    for(j = i + 1; j <= N; ++j)
        if(flag[j] == 2 || flag[j] == 3) return false;
    return true;
}//trySup
3.1. System Configuration
Algorithm 4 collects the global declarations of the Uppaal model in
Figure 1.
Algorithm 4. Uppaal global declarations of the model in Figure 1.
const int N = 4; //example
typedef int[1, N] pid; //process IDs
typedef int[0, 4] values; //possible values of flag[.] variables
values flag[pid]; //all zeros initially by default
urgent broadcast chan synch;
The Process automaton has only one parameter, which establishes the process identifier. N instances of the model are automatically created at system initialization time by using the following statement of system configuration:
3.2. Checking Algorithm’s Correctness
The correctness of Algorithm 2 and of the corresponding Uppaal model in Figure 1 can be assessed through a few timed computational tree logic (TCTL) queries supported by Uppaal [20]. First, the state graph (timed transition system) of the model is built by the verifier, that is, the set of all (hopefully finite) execution states and execution paths in the model. Then, specific properties of the mutual exclusion solution are investigated as follows.
3.2.1. Mutual Exclusion Property
The following query asks whether, in all the states of the state graph, the number of processes simultaneously found in their critical section is less than or equal to 1:
3.2.2. Absence of Deadlocks
No state of the state graph is deadlocked provided the following query is satisfied:
3.2.3. Bounded Overtaking (or Absence of Starvation)
The target process
experiences bounded overtaking if the following suprema (
) query is satisfied with a finite value of
:
3.2.4. Liveness-1
Any process eventually enters its critical section provided the following existential query is satisfied, whatever the identity of the target process
:
3.2.5. Liveness-2
A process in NCS does not prevent other processes from entering their CS if the following query is satisfied:
From the model checking of the Uppaal model in Figure 1, it emerged that the solution satisfies all the mutual exclusion properties. Properties were verified both when NCS is normal (an arbitrary time can elapse before the next competition of the process) and when the NCS location is urgent (a process immediately starts competing again after completing its CS). The linear overtaking
was confirmed. In the case NCS is urgent, and
, the suprema query terminates after 792 s with a memory usage of 12.6 GB, on a Win11 Pro desktop platform, Dell XPS 8940, Intel i7-10700 (8 physical cores), CPU@2.90 GHz, and 32 GB RAM, using version 5 of Uppaal 64 bit.
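The mutual exclusion and deadlock results can be cross-checked, at a much coarser level, with a few lines of explicit-state search. The Python sketch below is not the Uppaal model of Figure 1 and does not replace the TCTL queries. It treats every labeled statement of Algorithm 2 as one atomic step (so each await is evaluated at once rather than component by component), ignores time, and numbers processes from 0. The await and if conditions are assumptions: they mirror the try functions of Algorithm 3 and the commonly published form of Szymanski's algorithm. All names (`step`, `explore`, the control-point constants) are hypothetical.

```python
from collections import deque

N = 3  # small example size; the state space grows quickly with N

# Control points, named after the labels of Algorithm 2.
NCS, P11, P20, P21, P22, P30, P31, CS, E0, E1 = range(10)


def step(pc, i, flag):
    """Return (next_pc, new_flag_i) for process i, or None if it is blocked.

    ASSUMED conditions, taken from the try functions of Algorithm 3 and
    the usual statement of Szymanski's algorithm.
    """
    below, above = flag[:i], flag[i + 1:]
    if pc == NCS:                       # P10: announce interest
        return P11, 1
    if pc == P11:                       # wait until nobody is past the door
        return (P20, flag[i]) if all(v <= 2 for v in flag) else None
    if pc == P20:
        return P21, 3
    if pc == P21:                       # is someone else still entering?
        return (P22, 2) if any(v == 1 for v in flag) else (P30, flag[i])
    if pc == P22:                       # wait for a process in state 4
        return (P30, flag[i]) if any(v == 4 for v in flag) else None
    if pc == P30:
        return P31, 4
    if pc == P31:                       # lower identifiers go first
        return (CS, flag[i]) if all(v <= 1 for v in below) else None
    if pc == CS:
        return E0, flag[i]
    if pc == E0:                        # wait for higher ids in states 2, 3
        return (E1, flag[i]) if all(v not in (2, 3) for v in above) else None
    return NCS, 0                       # E1: leave the protocol


def explore():
    """Breadth-first search over all interleavings of N processes.

    Returns (mutex_ok, deadlock_free, cs_reached).
    """
    start = ((NCS,) * N, (0,) * N)
    seen, queue = {start}, deque([start])
    mutex_ok, deadlock_free, cs_reached = True, True, False
    while queue:
        pcs, flag = queue.popleft()
        in_cs = sum(pc == CS for pc in pcs)
        mutex_ok = mutex_ok and in_cs <= 1
        cs_reached = cs_reached or in_cs == 1
        moved = False
        for i in range(N):
            result = step(pcs[i], i, flag)
            if result is None:
                continue
            moved = True
            nxt = (pcs[:i] + (result[0],) + pcs[i + 1:],
                   flag[:i] + (result[1],) + flag[i + 1:])
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
        deadlock_free = deadlock_free and moved
    return mutex_ok, deadlock_free, cs_reached
```

Calling `explore()` visits every reachable combination of control points and flag values for three processes. Since the coarse steps admit fewer interleavings than the component-by-component model, a positive outcome here is weaker evidence than the Uppaal verification and says nothing about overtaking or about non-atomic registers.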
Algorithm 2 was refined in [25] by using only three output bits per process (active, waiting, shutting). The waiting room has only the input door. An active process wants to enter the waiting room. While waiting, further processes signaling that they also want to enter are allowed into the waiting room. Otherwise, the door is shut. Processes exit the waiting room according to their IDs. This algorithm too, with the correction noted in [
31], was modeled by our proposed Uppaal-based method with atomic memory. Model checking revealed that for
the model has no deadlock and admits a linear overtaking, but mutual exclusion can be violated. That is, multiple processes can enter their CS simultaneously.
The Uppaal model shown in
Figure 2 was derived from the algorithm in [
25] by tailoring it to two processes and by modifying the condition for abandoning the busy–waiting in P9. The current process is denoted by
and its partner by
. The informal algorithm behind
Figure 2 can be easily deduced from the model.
The model in
Figure 2 is a fully correct mutual exclusion solution (see also
Figure 3). It will be used later in this paper as the arbitration unit in a tournament tree organization [11,14,26,27] for N processes, under both atomic and non-atomic registers.
6. Conclusions
This paper proposes techniques for specifying register flickering and scrambling in mutual exclusion algorithms executing in non-atomic memory [
22]. The techniques are embedded into a formal methodology for modeling and exhaustive verification of mutual exclusion protocols. The methodology is based on timed automata [
16] and the Uppaal model checker [
19,
20,
21].
With respect to similar formal methods, e.g., based on process algebra and model checking [
22], this paper argues the proposed approach is more intuitive yet powerful to use.
Several examples were thoroughly modeled and verified using the consistency rules of non-atomic registers described in this paper. The observed results comply with the results reported in [22]. Some solutions were also extended so that they can be used in the context of the tournament tree (TT) organization [11,14,27]. TT provides a standard and efficient way of supporting mutual exclusion for N processes by relying on particular solutions for two processes used as the arbitration unit. As an original result of this paper, although Szymanski’s algorithm [25] was proved by our work to be incorrect with both atomic and non-atomic registers, its version for two processes, modified according to the indications in [31], is not only fully correct with non-atomic registers but, when exploited as the arbitration unit in TT, is also capable of furnishing correct mutual exclusion for N processes.
The paper also reports some experimental results concerning the verification efficiency of three flickering techniques. The so-called technique, which is a novel contribution of this paper, emerged to be the best one in all the carried-out experiments.
Ongoing and future work will be geared to the following points: first, to continue applying the developed non-atomic register consistency techniques to other mutual exclusion algorithms; second, to make a detailed experimental comparison of the proposed method with the approach described in [
22]; third, to consider the modeling of further consistency rules of non-atomic registers as proposed, e.g., in [
35].