1. Introduction
The mutual exclusion problem is a fundamental process synchronization problem in concurrent systems [1,2,3]. It is the problem of controlling the system in such a way that no two processes execute their critical sections (CSs) at the same time. Generalizations of mutual exclusion have been studied extensively, e.g., k-mutual exclusion [4,5,6,7,8,9], mutual inclusion [10] and l-mutual inclusion [11]. The k-mutual exclusion problem is that of controlling the system in such a way that at most k processes can execute their CSs at a time. The mutual inclusion problem is the complement of the mutual exclusion problem; unlike mutual exclusion, where at most one process is in the CS, mutual inclusion places at least one process in the CS. In a similar way, the l-mutual inclusion problem is the complement of the k-mutual exclusion problem; unlike k-mutual exclusion, where at most k processes are in the CSs, l-mutual inclusion places at least l processes in the CSs. These generalizations are unified into a single framework, “the critical section problem”, in [12]. Informally, the global $(l, k)$-CS problem is defined as follows. For each pair $(l, k)$, the global $(l, k)$-CS problem has at least l and at most k processes in the CSs in the entire network.
This paper discusses the generalized local CS problem, which is a new version of CS problems. When the numbers $l_i$ and $k_i$ are given for each process $P_i$, it is the problem of controlling the system in such a way that the number of processes that can execute their CSs at a time is at least $l_i$ and at most $k_i$ among $P_i$'s neighbors and $P_i$ itself. In this case, we call this problem “the generalized local $(l_i, k_i)$-critical section problem”. Note that the local $(l, k)$-CS problem assumes that the values of l and k are shared among all processes in the network, whereas the generalized local CS problem assumes that the values of $l_i$ and $k_i$ are set for each process $P_i$. These are generalizations of local mutual exclusion [13,14,15,16,17], local k-mutual exclusion [18] and local mutual inclusion [11]. If every process has $(l_i, k_i) = (0, 1)$, then the problem is the local mutual exclusion problem. If every process has $(l_i, k_i) = (0, k)$, then the problem is the local k-mutual exclusion problem. If every process has $(l_i, k_i) = (1, |N_i| + 1)$, then the problem is the local mutual inclusion problem, where $N_i$ is the set of $P_i$'s neighboring processes. The global CS problem is a special case of the local CS problem when the network topology is complete. To the best of our knowledge, our algorithm in this paper is the first solution for the generalized local $(l_i, k_i)$-CS problem.
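The special cases listed above can be made concrete with a small sketch. It is an illustration under stated assumptions only: the adjacency-dictionary representation and the helper names `special_case_bounds` and `is_safe` are hypothetical and do not come from the paper, and each pair of bounds is read as constraining the number of processes in the CS among a process and its neighbors.

```python
from typing import Dict, Set, Tuple

# Hypothetical representation: process id -> ids of its neighbors.
Graph = Dict[str, Set[str]]


def special_case_bounds(graph: Graph, kind: str, k: int = 1) -> Dict[str, Tuple[int, int]]:
    """Return per-process (lower, upper) bounds for three known special cases."""
    bounds: Dict[str, Tuple[int, int]] = {}
    for p, nbrs in graph.items():
        if kind == "local_mutual_exclusion":
            bounds[p] = (0, 1)
        elif kind == "local_k_mutual_exclusion":
            bounds[p] = (0, k)
        elif kind == "local_mutual_inclusion":
            # No effective upper bound: the whole closed neighborhood may be in the CS.
            bounds[p] = (1, len(nbrs) + 1)
        else:
            raise ValueError(f"unknown special case: {kind}")
    return bounds


def is_safe(graph: Graph, in_cs: Set[str], bounds: Dict[str, Tuple[int, int]]) -> bool:
    """Check the bounds in the closed neighborhood of every process."""
    for p, nbrs in graph.items():
        count = len((nbrs | {p}) & in_cs)
        low, high = bounds[p]
        if not low <= count <= high:
            return False
    return True


# A path a - b - c.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
mutex = special_case_bounds(path, "local_mutual_exclusion")
mutin = special_case_bounds(path, "local_mutual_inclusion")
```

On this path, placing only the middle process in the CS satisfies the local mutual inclusion bounds, whereas placing both end processes in the CS violates local mutual exclusion at the middle process.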
The generalized local $(l_i, k_i)$-CS problem is interesting not only theoretically, but also practically, because it is useful for fault tolerance and load balancing in distributed systems. For example, we can consider the following future applications.
One application is a formulation of the dynamic invocation of servers for load balancing. The minimum number of servers that are always invoked for quick responses to requests for $P_i$ is $l_i$. The number of servers changes dynamically with the system load. However, the total number of servers is limited by available resources, such as bandwidth, for $P_i$, and this limit is $k_i$.
Another is fault-tolerant services, where each process in the CS provides a service for the network. Because every process $P_i$ has direct access to at least $l_i$ servers, fault-tolerant service is guaranteed. However, because providing services involves a significant cost, the number of servers should be limited to at most $k_i$ for each process.
The other is that each process in the CS provides service A, and other processes provide service B for the network. Then, every process $P_i$ in the network has direct access to at least $l_i$ servers of A and to at least $|N_i| + 1 - k_i$ servers of B (with $N_i$ the set of neighbors of $P_i$).
In each case, the numbers $l_i$ and $k_i$ can be set for each process.
In this paper, we propose a distributed algorithm for the generalized local -CS problem for arbitrary , where for each process . To this end, we first propose a distributed algorithm for the generalized local -CS problem (we call it generalized local -mutual inclusion problem). It is the first algorithm for the problem. Next, we show that the generalized local -CS algorithms and the generalized local -CS algorithms are interchangeable by swapping process state, in the CS and out of the CS. By using this relationship between these two problems, we propose a distributed algorithm for the generalized local -CS problem for arbitrary , where for each process . We assume that there is a process , such that , and for each .
This paper is organized as follows. Section 2 provides several definitions and problem statements. Section 3 provides a solution to the generalized local $(l_i, |N_i| + 1)$-CS (i.e., generalized local $l_i$-mutual inclusion) problem. Section 4 presents an observation on the relationship between the generalized local $(l_i, k_i)$-CS problem and the generalized local $(|N_i| + 1 - k_i, |N_i| + 1 - l_i)$-CS problem. Section 5 provides a solution to the generalized local $(l_i, k_i)$-CS problem. In Section 6, we conclude and discuss future work.
2. Preliminaries
2.1. System Model
Let $V = \{P_1, P_2, \ldots, P_n\}$ be a set of n processes and $E \subseteq V \times V$ be a set of bidirectional communication links in a distributed system. Each communication link is FIFO. Then, the topology of the distributed system is represented as an undirected graph $G = (V, E)$. By $N_i$, we denote the set of neighboring processes of $P_i$. That is, $N_i = \{P_j \mid (P_i, P_j) \in E\}$. By $\mathrm{dist}(P_i, P_j)$, we denote the distance between processes $P_i$ and $P_j$. We assume that the distributed system is asynchronous, i.e., there is no global clock. A message is delivered eventually, but there is no upper bound on the delay time, and the running speed of a process may vary.
A set of local variables defines the local state of a process. By , we denote the local state of each process . A tuple of the local state of each process forms a configuration of a distributed system.
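The system model above can be sketched in a few lines. This is a minimal illustration, assuming processes are identified by strings and links by unordered pairs; the helper names `build_neighbors` and `distance`, and the `in_cs` field of the local state, are hypothetical. Distance is taken to be the hop count of a shortest path, computed by breadth-first search.

```python
from collections import deque
from typing import Dict, Iterable, Optional, Set, Tuple


def build_neighbors(edges: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Build the neighbor set of every process from bidirectional links."""
    adj: Dict[str, Set[str]] = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def distance(adj: Dict[str, Set[str]], src: str, dst: str) -> Optional[int]:
    """Hop count of a shortest path from src to dst, or None if unreachable."""
    if src not in adj or dst not in adj:
        return None
    seen = {src}
    queue = deque([(src, 0)])
    while queue:
        node, hops = queue.popleft()
        if node == dst:
            return hops
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    return None


links = [("P1", "P2"), ("P2", "P3"), ("P3", "P4")]
neighbors = build_neighbors(links)
# A configuration is modelled as one local state per process.
configuration = {p: {"in_cs": False} for p in neighbors}
```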
2.2. Problem
We assume that each process
maintains a variable
. For each configuration
C, let
(resp.,
) be the set of processes
with
(resp.,
) in
C. For each configuration
C and each process
, let
(resp.,
) be the set
(resp.,
). The behavior of each process
is as follows, where we assume that
eventually invokes entry-sequence when it is in the
state, and
eventually invokes exit-sequence when it is in the
state.
while true{ Entry-Sequence; ; /* Critical Section */ } Exit-Sequence; ; /* Remainder Section */ } } |
Definition 1. (The generalized local critical section problem). Assume that a pair of numbers and () is given for each process on network . Then, a protocol solves the generalized local critical section problem on G if and only if the following two conditions hold in each configuration C.
Safety: For each process , at any time.
Liveness: Each process changes and states alternately infinitely often.
We call the generalized local CS problem when and are given for each process “the generalized local -CS problem”.
We assume that the initial configuration is safe, that is each process satisfies . In the case of (resp., ), the initial state of each process can be (resp., ) because it satisfies the condition for the initial configuration. In the case of , the initial state of each process is obtained from a maximal independent set I as follows; a process is in the state if and only if it is in I. Note that existing works for CS problems assume that their initial configurations are safe. For example, for the mutual exclusion problem, most algorithms assume that each process is in the state initially, and some algorithms (e.g., token-based algorithms) assume that exactly one process is in the state and other processes are in the state initially. Hence our assumption for the initial configuration is common for existing algorithms.
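As an illustration of an initial configuration obtained from a maximal independent set, the following sketch computes one greedily and checks two properties: no two chosen processes are neighbors, and every process has itself or at least one neighbor chosen. The greedy construction and all function names are assumptions made for this example; the paper does not prescribe how the set I is computed.

```python
from typing import Dict, Set


def greedy_maximal_independent_set(adj: Dict[int, Set[int]]) -> Set[int]:
    """Scan processes in id order and keep each one that has no chosen neighbor."""
    chosen: Set[int] = set()
    for p in sorted(adj):
        if adj[p].isdisjoint(chosen):
            chosen.add(p)
    return chosen


def is_independent(adj: Dict[int, Set[int]], chosen: Set[int]) -> bool:
    """No two chosen processes are adjacent."""
    return all(adj[p].isdisjoint(chosen) for p in chosen)


def covers_every_closed_neighborhood(adj: Dict[int, Set[int]], chosen: Set[int]) -> bool:
    """Every process is chosen or has a chosen neighbor (maximality)."""
    return all((adj[p] | {p}) & chosen for p in adj)


# A ring of five processes 0..4.
ring = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
initial_in_cs = greedy_maximal_independent_set(ring)
```

Processes in the resulting set start in the CS and all others start outside it, so every process sees at least one process in the CS among itself and its neighbors.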
2.3. Performance Measure
We apply the following performance measure as message complexity to the generalized local CS algorithm: the number of message exchanges triggered by a pair of invocations of exit-sequence and entry-sequence.
3. Proposed Algorithm for the Generalized Local -Mutual Inclusion
In this section, we propose an algorithm -LMUTIN for the case that .
First, we explain how the safety is guaranteed. Initially, the configuration is safe, that is each process satisfies . When wishes to be in the state, requests permission by sending a message for each process in . When obtains a permission by receiving a message from each process in , changes to the state. Each process grants permissions to processes at each time. Hence, at least processes in cannot be in the state at the same time. When wishes to be in the state, changes to the state and sends a message for each process in to manage the next request for exiting the CS.
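The grant-limiting idea behind safety can be sketched as a small bookkeeping class. This is a simplified illustration, not the paper's algorithm: the class name `GrantManager`, its methods, and the `capacity` parameter are hypothetical, the bound on simultaneous grants is passed in rather than derived, and preemption is deliberately omitted. Requests are (clock, process id) pairs, and pending requests are served in timestamp order.

```python
import heapq
from typing import List, Optional, Set, Tuple

Request = Tuple[int, str]  # (logical clock value, process id)


class GrantManager:
    """Grants at most `capacity` permissions at a time (capacity is an assumed parameter)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.granted: Set[Request] = set()
        self.pending: List[Request] = []  # min-heap ordered by timestamp

    def on_request(self, req: Request) -> Optional[Request]:
        """Grant immediately if a slot is free; otherwise keep the request pending."""
        if len(self.granted) < self.capacity:
            self.granted.add(req)
            return req
        heapq.heappush(self.pending, req)
        return None

    def on_release(self, pid: str) -> Optional[Request]:
        """Free the slot held by pid and grant the pending request with the lowest timestamp."""
        self.granted = {r for r in self.granted if r[1] != pid}
        if self.pending and len(self.granted) < self.capacity:
            nxt = heapq.heappop(self.pending)
            self.granted.add(nxt)
            return nxt
        return None
```

Because a requester changes state only after collecting a grant from every process it asked, bounding the outstanding grants at each process bounds how many processes in its neighborhood can have left the CS at the same time.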
Next, we explain how the liveness is guaranteed. We incorporate the timestamp mechanism proposed by [
19] in our algorithm. Based on the priority of the timestamp for each request to change the state, a process preempts a permission when necessary, as proposed in [
11,
20,
21]. The proposed algorithm uses
and
messages for this purpose.
In the proposed algorithm, each process maintains the following local variables.
: The current state of : or .
: The current value of the logical clock [
19].
: The number of grants that obtains for exiting the CS.
: A set of timestamps for the requests to ’s exiting the CS that has been granted, but that has not yet released.
: A set of timestamps for the requests to ’s exiting the CS that are pending.
: A process id such that preempts a permission for ’s exiting the CS if the preemption is in progress.
For each request, a pair $(ts_i, P_i)$ is used as its timestamp, where $ts_i$ denotes the logical clock value of $P_i$. We implicitly assume that the value of the logical clock [19] is attached to each message exchanged. Thus, in the proposed algorithm, we omit a detailed description of the maintenance protocol for $ts_i$. The timestamps are compared as follows: $(ts_i, P_i) < (ts_j, P_j)$ iff $ts_i < ts_j$ or ($ts_i = ts_j$ and $P_i < P_j$).
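A timestamp comparison of this kind can be sketched as follows. The sketch assumes that a timestamp is a pair of a logical clock value and a process id, that ties on the clock value are broken by the process id, and that the clock is maintained in the usual logical-clock style (increment on local events, jump past any larger received value). The names `Timestamp`, `precedes`, and `LogicalClock` are hypothetical.

```python
from typing import NamedTuple


class Timestamp(NamedTuple):
    clock: int
    pid: int


def precedes(a: Timestamp, b: Timestamp) -> bool:
    """True iff a has higher priority: smaller clock value, ties broken by process id."""
    return a.clock < b.clock or (a.clock == b.clock and a.pid < b.pid)


class LogicalClock:
    """Logical clock kept by one process (maintenance rule assumed, not taken from the paper)."""

    def __init__(self) -> None:
        self.value = 0

    def tick(self) -> int:
        """Advance the clock for a local event such as issuing a request."""
        self.value += 1
        return self.value

    def observe(self, received: int) -> int:
        """Advance the clock past a clock value attached to a received message."""
        self.value = max(self.value, received) + 1
        return self.value


requests = [Timestamp(4, 1), Timestamp(3, 2), Timestamp(3, 1)]
# Tuples compare lexicographically, which coincides with precedes().
highest_priority = min(requests)
```

Since the order is total, any set of pending requests has a unique request with the lowest timestamp, which is the one served first.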
A formal description of the proposed algorithm for each process $P_i$ is presented in Algorithms 1 and 2. When a process $P_i$ receives a message, it invokes the corresponding message handler. Each message handler is executed atomically; that is, while a message handler is being executed, the arrival of a new message does not interrupt it. In this algorithm description, we use the statement wait until (conditional expression). By this statement, a process is blocked until the conditional expression becomes true. If a process receives a message while it is blocked by the wait until statement, it invokes the corresponding message handler.
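The execution model described here, atomic handlers plus a wait until statement that keeps dispatching messages, can be sketched with a single-threaded loop. This is a minimal illustration under stated assumptions: the class `Process`, the inbox queue, and the message name "Grant" are hypothetical, and a real process would block on the network instead of raising an error when no message is available.

```python
from collections import deque
from typing import Callable, Deque, Dict, Tuple


class Process:
    def __init__(self) -> None:
        self.inbox: Deque[Tuple[str, object]] = deque()
        self.handlers: Dict[str, Callable[[object], None]] = {}
        self.grants = 0

    def deliver(self, kind: str, payload: object = None) -> None:
        """Queue an incoming message; it is handled later, never mid-handler."""
        self.inbox.append((kind, payload))

    def wait_until(self, condition: Callable[[], bool]) -> None:
        """Block until condition() holds, dispatching message handlers meanwhile."""
        while not condition():
            if not self.inbox:
                raise RuntimeError("no pending message; a real process would keep waiting")
            kind, payload = self.inbox.popleft()
            # The handler runs to completion before the next message is taken,
            # which models atomic execution of message handlers.
            self.handlers[kind](payload)


p = Process()


def count_grant(_payload: object) -> None:
    p.grants += 1


p.handlers["Grant"] = count_grant
for _ in range(3):
    p.deliver("Grant")
p.wait_until(lambda: p.grants >= 2)
```

After the call returns, two messages have been handled and the third is still queued, showing that the wait ends as soon as the condition becomes true.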
Algorithm 1 Local variables for process in algorithm -LMUTIN |
, initially : integer, initially 0; : integer, initially 0; : set of (integer, processID), initially ; : set of (integer, processID), initially ∅; : processID, initially nil; |
Algorithm 2 Algorithm -LMUTIN: exit-sequence, entry-sequence and message handlers. |
Exit-Sequence: ; ; for-each send to ; wait until ; ; Entry-Sequence: ; for-each send to ; On receipt of a message: ; if ; ; send to ; } else if ; if ; send to ; } } On receipt of a message: ; On receipt of a message: if ; Delete from ; if ; ; send to ; } On receipt of a message: if ; send to ; } On receipt of a message: ; Delete from , and let be the deleted item; ; ; ; send to ; |
3.1. Proof of Correctness
In this subsection, we show the correctness of -LMUTIN.
Lemma 1. (Safety) For each process , holds at any configuration C.
Proof. We assume that the initial configuration is safe, i.e., . Therefore, we consider the process , which becomes unsafe first for the contrary. Suppose that , that is . Because , consider a process , which became the state by the -th lowest timestamp among processes in . Then, obtains permission to be the state from each process in . This implies that receives a request from and that sends a permission to . Because grants at most permissions to exit the CS at each time, cannot obtain a permission from ; this is a contradiction. ☐
Lemma 2. (Liveness) Each process changes into the and states alternately infinitely often.
Proof. By contrast, suppose that some processes do not change into the and states alternately infinitely often. Let be such a process where the lowest timestamp value for its request to be the state is . Without loss of generality, we assume that is blocked in the state. That is, is blocked by the wait until statement in the exit-sequence (recall that each process changes into the state eventually when it is in the state). Let be any process in .
Suppose that changes into the and states alternately infinitely often. After receives the message from , the value of exceeds the timestamp for ’s request. Because, by this algorithm, the request with the lowest timestamp is granted preferentially, it is impossible for to change into the and states alternately infinitely often. Then, eventually sends a message to , and eventually sends a message to itself.
Suppose that does not change into the and states alternately infinitely often. Because the timestamp of is smaller than that of , by assumption, ’s permission is preempted, and a message is sent from to . In addition, sends a message to itself.
Therefore, eventually receives a message from each process in , and the wait until statement in the exit-sequence does not block forever. ☐
3.2. Performance Analysis
Lemma 3. The message complexity of -LMUTIN for is in the best case and in the worst case.
Proof. First, let us consider the best case. In exit-sequence, for ’s exiting the CS, sends a message to each process in ; each process in sends a message to . In entry-sequence, after ’s entering the CS, sends a message to each process in . Thus, messages are exchanged.
Next, let us consider the worst case. For ’s exiting the CS, sends a message to each process in . Then, sends a message to the process to which sends a message, sends a message back to and sends a message to . After ’s entering the CS, sends a message to each process in . Then, sends a message to return a grant to or grant to some process with the highest priority in . Thus, messages are exchanged. ☐
Theorem 1. $l_i$-LMUTIN solves the generalized local $(l_i, |N_i| + 1)$-critical section problem with a message complexity of $O(\Delta)$, where Δ is the maximum degree of a network.
4. The Generalized Local Complementary Theorem
In this section, we discuss the relationship between the generalized local CS problems.
Let be an algorithm for the global -CS problem, and be an algorithm for the generalized local -CS problem. By Co-(resp., Co-), we denote a complement algorithm of (resp., ), which is obtained by swapping the process states, and .
In [
12], it is shown that the complement of
is a solution to the global
-CS problem. We call this relation the complementary theorem. Now, we show the generalization of the complementary theorem for the settings of local CS problems.
Theorem 2. For each process , a pair of numbers and is given. Then, is an algorithm for the generalized local -CS problem.
Proof. By , at least and at most processes among each process and its neighbors are in the CS. Hence, by , at least and at most processes among each process and its neighbors are out of the CS. That is, at least and at most processes among each process and its neighbors are in the CS. ☐
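The counting argument in this proof can be checked mechanically. The sketch below assumes the bounds are read as follows: if between l and k of the d + 1 processes in a closed neighborhood (a process with d neighbors, plus itself) are in the CS, then after swapping the two states between d + 1 - k and d + 1 - l of them are in the CS. The function names are hypothetical.

```python
from typing import Tuple


def complement_bounds(l: int, k: int, degree: int) -> Tuple[int, int]:
    """Bounds satisfied after swapping in-CS and out-of-CS states in a closed neighborhood."""
    size = degree + 1  # the process itself plus its neighbors
    return size - k, size - l


def within(count: int, bounds: Tuple[int, int]) -> bool:
    low, high = bounds
    return low <= count <= high


def swapping_preserves_safety(l: int, k: int, degree: int) -> bool:
    """For every possible count c in the CS, c is safe iff size - c is safe for the complement."""
    size = degree + 1
    co = complement_bounds(l, k, degree)
    return all(within(c, (l, k)) == within(size - c, co) for c in range(size + 1))
```

For example, with four neighbors, bounds (2, 5) of mutual-inclusion type map to (0, 3), which are bounds of mutual-exclusion type, and applying the complement twice returns the original bounds.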
By Theorem 2, Co-(-LMUTIN) is an algorithm for the generalized local -CS problem where . We call it -LMUTEX.
5. Proposed Algorithm for the Generalized Local CS Problem
In this section, we propose an algorithm LKCS for the generalized local -CS problem for arbitrary , where for each process . We assume that the initial configuration is safe. Before we explain the technical details of LKCS, we explain the basic idea behind it.
5.1. Idea
The main strategy in LKCS is the composition of two algorithms, -LMUTIN and -LMUTEX. In the following description, we simply call these algorithms LMUTIN and LMUTEX, respectively. The idea of the composition in LKCS is as follows.
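Exit-Sequence: invoke the exit-sequence for LMUTIN and the exit-sequence for LMUTEX.
Entry-Sequence: invoke the entry-sequence for LMUTEX and the entry-sequence for LMUTIN.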
This idea does not violate safety, by the following observations.
The exit-sequence preserves safety because the invocation of the exit-sequence for LMUTIN preserves safety, and the invocation of the exit-sequence for LMUTEX trivially preserves safety.
Similarly, the entry-sequence preserves safety because the invocation of the entry-sequence for LMUTEX preserves safety, and the invocation of the entry-sequence for LMUTIN trivially preserves safety.
Because the invocation of the exit-sequence for LMUTIN in the exit-sequence and of the entry-sequence for LMUTEX in the entry-sequence may block a process forever, i.e., cause deadlocks and starvation, we need some mechanism to handle such situations, which makes the proposed algorithm non-trivial.
A problem in the above idea is the possibility of deadlocks in the following situation. There is a process with such that or has a neighbor with . Then, cannot change its state by exit-sequence until at least one of its neighbors with changes ’s state by entry-sequence. If or has a neighbor with , cannot change its state by entry-sequence until at least one of its neighbors with changes ’s state by exit-sequence. In the network, if every process is in such situation, a deadlock occurs.
To avoid such a deadlock, we introduce a mechanism called “sidetrack”, meaning that some processes reserve some grants, which are used only when the system is suspected to be in a deadlock. Hence, in a normal situation, i.e., when no deadlock is suspected, the number of processes in the CS is limited. In this sense, LKCS is unfortunately only a partial solution to the $(l_i, k_i)$-CS problem. A full solution to the problem is currently not known and is left as future work.
The idea of the “sidetrack” in LKCS is explained as follows. We select a process, say , with as a “leader”, and each process within two hops from the leader may allow at least and at most processes to be in the CSs locally in a normal situation. We assume that , because . Other processes may allow at least and at most processes to be in the CSs locally in any situation and . The leader observes the number of neighbor processes that may be blocked, and when the leader itself and all of the neighbors can be blocked, the leader suspects that the system is in a deadlock situation. Then, the leader designates a process within one hop (including the leader itself) to use the “sidetrack” to break the chain of cyclic blocking. Because the designated process uses one extra CS exit/entry, the number of processes in the CSs is at least and at most , and hence, LKCS does not deviate from the restriction of the -CS problem. The suspicion by the leader process is not always correct, i.e., may suspect that the system is in a deadlock when this is not true. However, incorrect suspicion does not violate the safety of the problem specification.
5.2. Details of LKCS
We explain the technical details of LKCS below. A formal description of LKCS for each process $P_i$ is presented in Algorithms 3–7. The execution model of this algorithm is the same as in the previous section, except that the while statement is used in LKCS. By while (conditional expression) {statement}, a process is blocked until the conditional expression is true. While a process is blocked by this statement, it executes only the statement between braces and message handlers. If a process receives a message while it is blocked by this statement, it invokes the corresponding message handler. That is, if the statement between braces is empty, this while statement is the same as the wait until statement.
Algorithm 3 Local variables and macros for process in algorithm LKCS |
Local Variables: ; , initially : integer, initially 1; : set of processID, initially ∅; : set of (integer, processID), initially ; : set of (integer, processID), initially ∅; : (integer, processID), initially nil; Local Variable only for a leader : : set of (integer, processID), initially ∅; Macros: |
Algorithm 4 Algorithm LKCS: exit-sequence and entry-sequence. |
Exit-Sequence: ; ; for-each send to ; while if /* The configuration may be in a deadlock. */ ; wait until ; } } ; for-each send to ; Entry-Sequence: ; for-each send to ; while if /* The configuration may be in a deadlock. */ ; wait until ; } } ; for-each send to ; |
When the leader suspects that the system is in a deadlock, it invokes the function and selects a process within one hop ( itself or a neighbor of ) as a “trigger”, and sends a message (Trigger message type) to so that issues a special request. Then, sends a special request message (RequestByTrigger message type) to each neighbor . This message also cancels the current request of . After each receives such special request from , then cancels the request of by deleting from its pending list () and its granted list (), inserts the special request to its granted list and immediately grants by using the “sidetrack”. The deleted element is a request that or other neighbors of kept it waiting if the system is in a deadlock, and the inserted element cannot be preempted because it has the maximum priority.
We explain the technical details how the leader
suspects that the system is in a deadlock. When
, then
. Because a request is not sent if a previous one is kept waiting, two pending lists
and
are disjoint. Thus, if there is a neighbor
, which is not in these pending lists, then
’s request is granted by
, but is kept waiting by a neighbor other than
in the deadlock configuration. That is,
’s request is in both of
. To make this suspicion possible, we assume that, at the leader process,
holds, i.e.,
. The underlying LMUTIN (resp., LMUTEX) algorithm sends at most
(resp.,
) grants; the total number of grants of the two underlying algorithms is at most
. Because
,
holds. This implies that there exists at least one process
that receives both grants of LMUTIN and LMUTEX from
.
Algorithm 5 Algorithm LKCS: message handlers (1). |
On receipt of a message: ; if ; ; send to ; } else if ; if ; send to ; } } On receipt of a message: if ; } On receipt of a message: if ; Delete from ; if ; ; send to ; } On receipt of a message: if Delete from ; send to ; } On receipt of a message: ; Delete from , and let be the deleted item; ; ; ; send to ; |
If the system is in a deadlock, is definitely involved in the deadlock. Giving special grants by the sidetrack resolves the deadlock.
If the system is not in a deadlock, is not involved in a deadlock. Even in this case, LKCS gives special grants by the sidetrack. This is because exact deadlock avoidance mechanisms require global information collection and incur a large message complexity.
With this local observation at the leader
, the deadlock is avoided with less message complexity.
Algorithm 6 Algorithm LKCS: function for the leader . |
; for-each /* may be waiting for grant messages to enter. */ ; } } ; ; } send to ; ; for-each /* may be waiting for grant messages to exit. */ ; } } ; ; } send to ; } } |
Algorithm 7 Algorithm LKCS: message handlers (2). |
On receipt of a message: /* is waiting for grant messages to exit. */ ; for-each send to ; /* Request message as a trigger. */ /* is waiting for grant messages to enter. */ ; for-each send to ; /* Request message as a trigger. */ } On receipt of a message: Delete from ; Delete from ; ; send to ; |
In the proposed algorithm, each process maintains the following local variables, where is the algorithm type, or . These variables work as the same as those of -LMUTIN, and we omit the detailed description here.
: The current state of : or .
: The current value of the logical clock [
19].
: A set of process ids from which obtains grants for exiting/entering to the CS.
: A set of timestamps for the requests to ’s exiting/entering to the CS that has been granted but that has not yet released.
: A set of timestamps for the requests to ’s exiting/entering to the CS that are pending.
: A timestamp of a request such that preempts a permission for ’s exiting/entering to the CS if the preemption is in progress.
5.3. Proof of Correctness
In this subsection, we show the correctness of LKCS. We assume the initial configuration is safe. First, we show that the process with cannot become unsafe by the proof by contradiction. Next, we show that other processes cannot become unsafe because they normally execute the algorithm as their instance . Thus, we can derive the following lemma.
Lemma 4. (Safety) For each process , holds at any configuration C.
Proof. We assume that the initial configuration is safe. First, we consider the process for which holds becomes unsafe first in a configuration C for the contrary.
Suppose that , that is . Because , consider a process , which became the state by the -th lowest timestamp among processes in . Then, obtains permission to be the state from each process in . This implies that receives a permission request from and that sends a permission to . Because grants at most permissions to exit the CS at each time by the condition , cannot obtain a permission from ; this is a contradiction.
Suppose that . Because , consider a process that became the state by the -th lowest timestamp among processes in . Then, obtains permission to be the state from each process in . This implies that receives a permission request from and that sends a permission to . Because grants at most permissions to enter the CS at each time by the condition , cannot obtain a permission from ; this is a contradiction.
Next, we consider that has . Note that the leader sends a trigger request to exactly one of its neighbors or itself at a time. Let be the receiver. If does not request to invert its state as a trigger, we can discuss by the same way as above, and because of condition (of course, if , ). When requests to invert its state as a trigger by sending a message , all of its neighbors grant it without attention to , and inverts its state without attention to . Thus, becomes or (if , becomes or ). Therefore, does not become unsafe. ☐
Next, suppose that a deadlock occurs in some configuration. Then, the processes waiting for grant messages form waiting chains that persist unless at least one process on a chain changes its state. In that case, however, the leader process designates one of its neighbors or itself as a trigger, and the trigger changes its state by its preferential right. Therefore, we can derive the following lemma.
Lemma 5. (Liveness) Each process changes into the and states alternately infinitely often.
Proof. For the contrary, we assume that a deadlock occurs in the configuration C. Let D be a set of processes that cannot change its state, that is they are in the deadlock. First, we assume that all of process in D has . Then, all of their neighbors have neighbors with . However, such neighbors are not in D and eventually change their state to . Thus, eventually can change its state; this is a contradiction. Therefore, in D, there is a process with such that it cannot change its state by the exit-sequence. Then, it holds or has a neighbor with . It is waiting for grant messages and cannot change its state until at least one of its neighbors with changes ’s state by the entry-sequence. If holds or has a neighbor with , cannot change its state by the entry-sequence until at least one of its neighbors with changes ’s state by the exit-sequence. By such chain relationship, it is clear that the waiting chain can be broken if at least one process on this chain changes its state. Thus, for the assumption, all processes in V are in such a chain in C, that is .
However, in such a C, all of and its neighbors are waiting for grant messages from their neighbors. That is, their requests are in or , and is equal to . Additionally, we assume that and the number of grants can send with (resp., ) is (resp., ). Because of safety, or holds. Then, sends a message to a neighbor and becomes a trigger if is also waiting for grant messages. Processes that receive grant the request without attention to . Then, can change its state, and after that, can change its state. Therefore, the waiting chain in C can be broken. This is a contradiction. ☐
5.4. Performance Analysis
Lemma 6. The message complexity of LKCS for is in the best case, and in the worst case.
Proof. First, let us consider the best case.
For ’s exiting the CS, sends a message to each process in ; each process in sends a message to ; then sends a message to each process in . Thus, messages are exchanged for ’s exiting the CS.
For ’s entering the CS, sends a message to each process in ; each process in sends a message to ; then sends a message to each process in . Thus, messages are exchanged for ’s entering the CS.
Thus, the message complexity is in the best case.
Next, let us consider the worst case.
For ’s exiting the CS, sends a message to each process in . Then, sends a message to the process to which sends a message; sends a message back to ; and sends a message to . After exits the CS, sends a message to each process in . Then, sends a message to some process with the highest priority in . Thus, messages are exchanged.
For ’s entering the CS, sends a message to each process in . Then, sends a message to the process to which sends a message; sends a message back to ; and sends a message to . After enters the CS, sends a message to each process in . Then, sends a message to some process with the highest priority in . Thus, messages are exchanged.
Thus, the message complexity is in the worst case. ☐
Theorem 3. LKCS solves the generalized local $(l_i, k_i)$-critical section problem with a message complexity of $O(\Delta)$, where Δ is the maximum degree of a network.
6. Conclusions
In this paper, we considered the generalized local $(l_i, k_i)$-critical section problem, which is a new version of critical section problems. Because this problem is useful for fault tolerance and load balancing in distributed systems, we can consider various future applications. We first proposed an algorithm for the generalized local $l_i$-mutual inclusion problem. Next, we showed the generalized local complementary theorem. By using this theorem, we proposed an algorithm for the generalized local $(l_i, k_i)$-critical section problem.
In the future, we plan to perform extensive simulations and confirm the performance of our algorithms under various application scenarios. Additionally, we plan to reduce the message complexity and time complexity of the proposed algorithm and to design an algorithm that guarantees exactly $(l_i, k_i)$ in every process.