A Coordination Technique for Improving Scalability of Byzantine Fault-Tolerant Consensus

Among various consensus algorithms, Byzantine Fault Tolerance (BFT)-based consensus algorithms are broadly used in private blockchains. However, because BFT-based consensus algorithms require all participants to take part in the consensus process, their scalability issue becomes more noticeable as the network grows. In this approach, we introduce a consensus coordinator that conditionally executes a BFT-based consensus algorithm after classifying transactions. Transactions are first divided into equal and unequal transactions; unequal transactions are then further classified into common and trouble transactions. The consensus algorithm is executed only for trouble transactions, so BFT-based consensus algorithms can achieve scalability. To evaluate our approach, we carried out three experiments in response to three research questions. By applying our approach to PBFT, we obtained 4.75 times better performance than using PBFT alone. In another experiment, we applied our approach to the IBFT of Hyperledger Besu, and the result shows a 61.81% performance improvement. In all experiments varying the number of blockchain nodes, we obtained better performance than the original BFT-based consensus algorithms; thus, we conclude that our approach improves their scalability. We also show a correlation between performance and trouble transactions associated with transaction issue intervals and the number of blockchain nodes.

representative example. In BFT-based consensus algorithms, the number of network communications among participants explodes as the number of participants increases. This is because every participant must be involved to complete the consensus process for each transaction, and the process itself consists of several steps. This characteristic may raise performance and scalability issues, because the more participants join, the slower the consensus process completes [15].
There have been several previous studies on improving the scalability of BFT-based consensus algorithms (see [16][17][18][19][20][21][22][23]). Some works build sub-groups of blockchain nodes on a regular basis and reduce the number of network communications through a two-step execution of the BFT consensus algorithm: consensus inside each sub-group, followed by consensus among the representatives of the sub-groups. While this approach increases the PBFT algorithm's scalability, it has the shortcoming that a new node cannot join any group until new groups have been formed. Other works optimize the number of communications by modifying the PBFT protocol, introducing a collector role or removing faulty nodes during consensus. Similarly, some works reduce the number of prime node elections to minimize the election overhead. However, it is hard to expect outstanding scalability improvements from these approaches. In addition, another approach deploys a new hardware-based BFT execution environment, but it has the apparent weakness that all nodes must prepare the specific hardware environment in advance.
To address the above issues, we propose a coordination technique for scalable Byzantine fault-tolerant consensus algorithms. The key idea is to introduce a Consensus Coordinator that controls the conditional execution of a BFT-based consensus algorithm after classifying the transactions of all nodes with respect to their equality. Our approach runs regularly, tied to the block generation interval, and consists of four steps. First, a prime node is elected among all blockchain nodes to execute a BFT-based consensus algorithm and communicate with the consensus coordinator. Second, the coordinator collects transactions from the transaction pool of each node. Third, the coordinator classifies transactions based on their equality and decides whether to execute a consensus algorithm. If all transactions are equal, the coordinator lets the prime node generate a block without executing a consensus algorithm, which completes the synchronization of the blockchain network. When some transactions are not equal, the coordinator divides them into common and trouble transactions and requests the prime node to execute a BFT-based consensus algorithm only for the trouble transactions. The prime node notifies the coordinator of the agreed transactions. Finally, the coordinator sorts all common and agreed transactions in time order and requests the prime node to generate a new block containing all processed transactions.
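The four steps above can be sketched in a few lines; this is a minimal illustration assuming simple node objects with a `tx_pool` attribute and caller-supplied `run_bft` and `generate_block` functions, not the authors' implementation:

```python
def coordinator_round(nodes, run_bft, generate_block):
    # Steps 1-2: a prime node is elected separately; the coordinator
    # collects the transaction pool of every node.
    pools = [set(n.tx_pool) for n in nodes]
    # Step 3: classify transactions by equality across all pools.
    if all(p == pools[0] for p in pools):
        # All pools are equal: generate the block without consensus.
        generate_block(sorted(pools[0]))
        return
    common = set.intersection(*pools)       # present in every pool
    trouble = set.union(*pools) - common    # missing from some pool
    agreed = set(run_bft(trouble))          # BFT runs only on trouble txs
    # Step 4: merge common and agreed transactions, sort, generate block.
    generate_block(sorted(common | agreed))
```

The point of the sketch is the conditional branch: the expensive BFT round is skipped entirely when all pools already agree.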
For the evaluation of our approach, we conducted three experiments answering three research questions. We measured the performance of the PBFT algorithm with and without our approach. In one experiment, PBFT equipped with our approach obtained an average of 4.75 times better performance than PBFT alone. In addition, we applied our approach to Hyperledger Besu, which uses the IBFT (Istanbul Byzantine Fault Tolerance) consensus algorithm, and showed a 61.81% performance improvement compared to IBFT alone. We also present the correlation of performance and trouble transactions associated with the transaction issue interval and the number of blockchain nodes. The contributions of our approach are summarized as follows:
• We propose a novel coordination technique to improve the scalability and performance of consensus algorithms, applicable to diverse BFT-based consensus algorithms.
• We implemented our approach, applied it to PBFT and the Hyperledger Besu consensus algorithm, and released it as an open-source project for public access.
• We performed three experiments in response to three research questions and showed the feasibility of our approach.
The remainder of this paper is organized as follows. Section 2 presents BFT-based consensus algorithms as background and related work for improving the scalability of BFT-based consensus algorithms. Section 3 proposes our coordination technique and explains the four steps for achieving a scalable BFT-based consensus algorithm in detail. Section 4 presents the evaluation of our approach by responding to our three research questions. Section 5 concludes the paper and discusses future work.

Background and Related Work
This section presents BFT-based consensus algorithms and their characteristics as background and introduces some previous works regarding how to improve BFT-based consensus algorithms' scalability.

Background: BFT-Based Consensus Algorithms
BFT (Byzantine Fault Tolerance)-based consensus algorithms indicate a group of consensus algorithms for resolving the Byzantine generals problem, i.e., how to achieve consensus on data in an environment where normal and malicious nodes are mixed [24]. The representative example is PBFT (Practical Byzantine Fault Tolerance) [13,14] in Hyperledger Fabric 0.6 (Hyperledger Fabric later replaced PBFT with Raft as of version 2.0 [25]), and diverse variations of PBFT such as Tendermint in Cosmos [26], HotStuff in Libra [27], and IBFT in Hyperledger Besu [28] are broadly used. A key characteristic of BFT-based consensus algorithms is the fast finality of a transaction: a transaction is immediately finalized once it is issued by a client and validated by N = 3f + 1 participants. In other consensus algorithms such as PoW and PoS, by contrast, a client must wait until the transaction is contained in a new block after issuing it. In Bitcoin, for example, it theoretically takes 1 h to finalize a transaction, and in the worst case it often takes longer when a block is forked. Despite the fast finality of BFT-based consensus algorithms, performance and scalability inevitably decrease as the number of nodes increases. This is because all participants must join the consensus process, and more than two-thirds of the nodes must agree on transactions by communicating with each other in four steps: pre-prepare, prepare, commit, and reply (Figure 1). Thus, this mechanism always raises a scalability issue depending on the number of nodes [15].
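The N = 3f + 1 bound and the resulting quorum sizes can be made concrete with a few lines of arithmetic (an illustrative sketch, not tied to any particular implementation):

```python
def max_faulty(n: int) -> int:
    """Largest number f of Byzantine nodes tolerable when n >= 3f + 1."""
    return (n - 1) // 3

def quorum(n: int) -> int:
    """Matching votes needed (2f + 1), so any two quorums share a correct node."""
    return 2 * max_faulty(n) + 1

def messages_all_to_all(n: int) -> int:
    """All-to-all traffic in the prepare/commit phases grows quadratically in n."""
    return n * (n - 1)
```

For example, 4 nodes tolerate one faulty node with a quorum of 3, while 80 nodes tolerate 26; the quadratic message growth is what makes PBFT-style protocols degrade as nodes are added.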

Related Work
Many approaches have been suggested to improve the scalability of BFT-based consensus algorithms, most of which try to reduce the number of network communications. To control the number of nodes participating in the consensus protocol, Feng et al. suggested the SDMA (Scalable Dynamic Multi-Agent)-PBFT approach, which reduces the number of participants [16]. The approach builds sub-groups among the peers and elects an agent as the primary node of each sub-group. It then carries out the consensus process within the sub-groups first, and a second consensus process is performed only among the agents. While this approach increases the PBFT algorithm's scalability by reducing the communication paths of the established blockchain network, it has the shortcoming that a new node cannot join any previously established group until new groups have been formed.
Similar to Feng et al.'s approach, Luu et al. proposed SCP (Scalable Byzantine Consensus Protocol) by executing the first consensus algorithm within sub-groups and the second consensus algorithm among group leaders from a result of the first execution [17]. The approach builds sub-groups by generating a random group number based on their IP address, public key, and nonce, while Feng et al.'s approach builds sub-groups by making the spanning-tree from a root node. Although this research contributed to reducing communication paths, it still has a similar problem that new nodes cannot easily join a blockchain network as in the approach of Feng et al.
As another approach, some research tries to optimize the number of communications by introducing a collector role or removing faulty nodes during consensus processes. Kotla et al. proposed a new BFT-based consensus protocol, named Zyzzyva, where the number of non-faulty nodes required for PBFT adaptively changes from N = 3f + 1 to N = 2f + 1 when a faulty node is detected during a consensus process [18]. Gueta et al. suggested SBFT (State-of-the-art Byzantine Fault Tolerant) [19], which reduces the number of communication paths among nodes by gathering consensus messages at two collector nodes and validating messages in limited places. Similar to Gueta et al.'s approach, Jiang et al. suggested HSBFT (High Performance and Scalable Byzantine Fault Tolerance), which makes the prime node play a collector role that collects and validates all messages [20]. HSBFT has a prime node election process based on a node stability table containing identity number, state, IP, and public key. Based on the table, HSBFT excludes unstable nodes and optimizes communication paths. Although these three approaches reduce the number of participant nodes and communication paths, an outstanding improvement in scalability or performance is hard to expect.
In addition, Lei et al. proposed the RBFT (Reputation-based Byzantine Fault Tolerance) algorithm for reducing communication paths in a private blockchain [21]. Each blockchain node computes a reputation score based on an evaluation of its behaviors (e.g., good behavior in generating a new block), and the number of votes permitted to each node differs depending on the reputation score. Votes decide whether each PBFT step passes, by checking whether the number of votes exceeds a specific threshold. Although this can reduce the number of communications, a limited set of nodes can gain an outsized influence on the voting process.
Some research tries to minimize the number of prime node election processes to enhance scalability. Gao et al. proposed a trust-eigen-based PBFT consensus algorithm called T-PBFT [22]. In this approach, they tried to minimize the number of prime node elections based on each node's trust evaluation. Before starting the PBFT consensus process, the proposed eigen trust model evaluates all nodes' trust scores and forms a group called the primary group. The consensus process then consists of two steps: (1) consensus within the primary group; and (2) consensus between the remaining nodes and the primary group. It can improve scalability by reducing the frequency of primary node changes. However, its number of communications is the same as PBFT's, so a distinct scalability improvement is hard to expect.
A new hardware-based execution environment has also been introduced to improve the performance of BFT-based consensus algorithms. Liu et al. proposed a hardware-based BFT algorithm execution environment named FastBFT [23]. All nodes use a hardware chip (e.g., Intel SGX) providing a TEE (Trusted Execution Environment) to execute the consensus algorithm, and the TEE supports public key operations (e.g., multi-signatures) during the consensus process. They also suggested the FastBFT algorithm, which reduces verification steps by collaborating with the TEE. While this improves the BFT algorithm's scalability, the assumption that all nodes must run on a TEE is a clear limitation.

A Coordination Technique for Scalable BFT Consensus
This section presents a coordination technique for achieving the scalability of BFT-based consensus algorithms. Our approach is composed of two parts: Our Coordination Technique and BFT-based Consensus Algorithm (Figure 2). The BFT-based Consensus Algorithm part, located at the bottom of the figure, indicates traditional BFT-based algorithms such as PBFT and IBFT. It is composed of one prime node that controls the consensus process and other general nodes, as in typical BFT-based blockchain platforms. Each node has a BFT-Module, which controls the consensus process and generates new blocks, and a Transaction Pool, which maintains unconfirmed transactions. The BFT-Module regularly accesses the transactions in the transaction pool and executes a BFT-based consensus algorithm. Once all nodes have achieved consensus on the transactions, the BFT-Module produces a new block from the agreed transactions.
Our Coordination Technique part corresponds to the top of the figure, above the BFT-based Consensus Algorithm part. In this technique, we newly introduce the Consensus Coordinator, which controls the conditional execution of each node's BFT-based consensus algorithm depending on the equality of transactions. Our consensus coordination technique consists of four steps. (1) A prime node is elected among all participating nodes. (2) The coordinator collects all transactions that exist in the transaction pool of each node. (3) The coordinator checks the equality of transactions and classifies them into common and trouble transactions; for the trouble transactions, the coordinator requests the prime node to execute a consensus algorithm and obtains the agreed transactions. (4) The coordinator merges the common and agreed transactions and requests the controller of every node to execute block generation with the merged transactions. The following subsections describe these steps in more detail.

Step 1. Electing a Prime Node
The first step is to elect a prime node among all participant nodes. The elected prime node plays the role of interacting with the consensus coordinator. This election step runs at every regular time interval t (we assume that all nodes share the same time unit through a logical or physical clock algorithm (e.g., [29][30][31])). Once all steps are completed, the prime node election step is carried out again. The algorithm of this step is shown in Algorithm 1.

Algorithm 1 [All Nodes] Electing Prime Node
1: random = Random(seed) % N(Node)
2: if random equals Node_i then
3:    sig_prime = Signature(Node_i, seed)_sk_prime
4:    notify (sig_prime, pk_prime) to CC
5: end if

Electing the prime node starts with a seed, which is the previous block's hash value; the output of the random function seeded with this value is taken modulo the total number of nodes N(Node) to obtain a prime node number. The hash value of the previous block is the same across all nodes because it was already agreed in the previous round. If the result random equals a node's pre-assigned unique number Node_i, that node is elected as the prime node. After signing its node number Node_i and the seed with its private key sk_prime, the elected node notifies the consensus coordinator CC with the signature sig_prime and its public key pk_prime.
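The deterministic part of this election (every node deriving the same prime index from the shared previous-block hash) might look as follows in Python; signature handling is elided and all names are illustrative:

```python
import random

def elect_prime(prev_block_hash: str, n_nodes: int) -> int:
    # Seeding with the agreed previous block hash makes every node
    # compute the same index without any extra communication.
    rng = random.Random(prev_block_hash)
    return rng.randrange(n_nodes)

# Each node runs elect_prime(...); only the node whose own number matches
# the result signs (Node_i, seed) and notifies the coordinator, which can
# recompute the same index to validate the claim.
```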

Step 2. Collecting Transactions from Transaction Pool
Once the coordinator receives the prime node election notification, it collects all transactions from each node's transaction pool as the second step. This collection step is presented in Algorithm 2. The input parameters of this step are sig_prime and pk_prime from the prime node. The coordinator then checks the elected prime node's validity by regenerating the random number from the delivered seed and Node_i (see Algorithm 2). If the regenerated random equals the Node_i received from the prime node, the consensus coordinator requests all nodes to send all transactions accumulated in their transaction pools between the previous time time_p and the current time time_c. When random differs from Node_i, the coordinator terminates all coordination steps of this round and waits in an idle state for the next round.

Algorithm 2 [Coordinator] Collecting Transactions from TxPool
1: Inputs: sig_prime, pk_prime
2: Initialize: Txs ← {}
3: Node_i, seed = Signature(sig_prime)_pk_prime
4: random = Random(seed) % N(Node)
5: if random equals Node_i then
6:    for i = 0; i ≤ N(Node); i++ do
7:       Txs_i ← request Node_i to send transactions between time_p and time_c

If all transactions are the same, the coordinator requests the prime node Node_prime to confirm the transactions Txs_0, and the prime node responds with its confirmation:

isConfirmed, sig_prime, pk_prime ← request Node_prime to confirm Txs_0
if isConfirmed == true then
   request All Nodes to generate a new block with Txs_0 and sig_prime

This confirmation step is necessary for mutual trust between the coordinator and the prime node regarding the integrity of the transactions held by the coordinator. The response from the prime node includes isConfirmed, sig_prime, and pk_prime; among these, sig_prime results from the prime node executing Signature(Txs_0)_sk_prime. When the prime node confirms that all transactions are equal, the coordinator requests all nodes of the blockchain network to generate a new block. The controller of each node receives the request and delegates it to the BFT-Module, which generates the new block. Every node's time_p is then updated to the current time_c, designating the starting point of the next round. If the prime node does not confirm the transactions, the round is terminated and time_p remains at the previous time. When some of the transactions differ, the coordinator instead performs the handling of unequal transactions presented in the next subsection.
4:  for j = 0; j ≤ N(Txs_i); j++ do
5:     isCommon ← true
6:     for k = 0; k ≤ N(Node) and i ≠ k; k++ do
7:        if Tx_j does not exist in Txs_k then
8:           isCommon ← false
9:           break
10:       end if
11:    end for
12:    if isCommon == true then
13:       List_comm ← Tx_j

(2) Executing a Consensus Algorithm: For the trouble transactions List_tr from the previous sub-step, the coordinator requests the prime node to execute a consensus algorithm and obtains a list of agreed transactions List_agg from the prime node (Algorithm 4, Line 19). It should be noted that any BFT-based consensus algorithm can be applied in this step. Not all transactions in List_tr may be contained in List_agg, because some of them might not complete the consensus algorithm; transactions that are not agreed upon are removed according to the BFT-based consensus algorithm.
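The classification loop above can be rendered in Python as a sketch (names are illustrative; `pools` stands for the per-node transaction lists Txs_i):

```python
def classify(pools):
    """Split transactions into common (in every pool) and trouble ones."""
    common, trouble, seen = [], [], set()
    for txs in pools:
        for tx in txs:
            if tx in seen:
                continue                     # classify each tx only once
            seen.add(tx)
            if all(tx in other for other in pools):
                common.append(tx)            # List_comm: in every pool
            else:
                trouble.append(tx)           # List_tr: needs BFT consensus
    return common, trouble
```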
(3) Sorting All Transactions: Based on the common transactions List_comm and the agreed transactions List_agg, the coordinator merges and sorts them in time order. Then, the coordinator requests the prime node to confirm the merged transactions, similar to the process for equal transactions (see Algorithm 5, Line 5). The sig_prime is produced by the prime node using Signature(SortedList_csn)_sk_prime. If the transactions are confirmed, the coordinator requests all nodes to generate a new block with the SortedList_csn, and all nodes' time_p is updated with the current time_c.

Algorithm 5 [Coordinator] Sorting Transactions
1: Inputs: List_comm, List_agg
2: Initialize: List_csn ← {}, SortedList_csn ← {}
3: List_csn ← (List_comm ∪ List_agg)
4: SortedList_csn ← sort(List_csn)
5: isConfirmed, sig_prime, pk_prime ← request Node_prime to confirm SortedList_csn
6: if isConfirmed == true then
7:    request All Nodes to generate a new block with SortedList_csn and sig_prime
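A minimal sketch of the merge-and-sort step, modeling transactions as (timestamp, payload) pairs (an assumption made here for illustration):

```python
def merge_and_sort(common, agreed):
    # Union of List_comm and List_agg, de-duplicated, in time order.
    return sorted({*common, *agreed}, key=lambda tx: tx[0])
```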

Step 4. Generating a New Block
The last step is to generate a new block, which relies on the underlying blockchain platform. The coordinator does not intervene in this final step; all nodes create a new block from the transferred transactions and the previous block's hash. The controller, which requests the BFT-Module to generate a new block and accesses the transaction pool, is developed per blockchain platform, so our approach can be applied to diverse BFT-based consensus algorithms.

Evaluation
This section describes our experiments' results designed to evaluate our approach. For this evaluation, we established the three research questions below and carried out three experiments in response to research questions.
• RQ1: How much can the scalability of PBFT be increased through our proposed approach?
• RQ2: What is the correlation between the trouble transactions and performance?
• RQ3: How much can our approach improve the scalability of IBFT of Hyperledger Besu?

RQ1: How Much Can the Scalability of PBFT be Increased through our Proposed Approach?
This first research question is intended to figure out how much our suggested approach achieves our research aim, which is to improve the scalability of the BFT-based consensus algorithm. For this research question, we selected and implemented PBFT, the most popular BFT-based consensus algorithm. We then measured the performance of PBFT with and without our approach, and increased the number of nodes to examine the scalability of our approach.
Experimental setting for RQ1. To respond to RQ1, we built a PBFT network (our source code is available at https://github.com/jungwonrs/JwRalph_Seo/tree/master/lab/Agent_Consensus) based on Castro and Liskov's research [13,14]. Initially, we structured four nodes and issued a transaction every 10 ms for 10,000,000 ms, so that we transmitted one million transactions over 2 h 40 min. In addition, we set the block generation time to 10 s, which implies that each node generates a new block every 10 s from the transactions in its transaction pool (through a sensitivity analysis, we found 10 s to be the block generation time yielding the best performance). Then, we measured the total elapsed time until the consensus process of the transactions completed and computed the average elapsed time. We prepared 81 physical computers and deployed 80 PBFT nodes and one consensus coordinator, one per computer. Each computer had an Intel i5-3570 3.4 GHz CPU with 4 GB RAM, running Windows 10.
Experimental Result for RQ1. Figure 4 shows the result of the experiment. In the initial experiment with four nodes, PBFT with the consensus coordinator, denoted CC + PBFT, achieved 0.0328 s per transaction on average, while PBFT without our approach, denoted PBFT, showed 0.1237 s. The gap in elapsed time between the two approaches grew as the number of nodes increased. When the number of nodes reached 80, the elapsed times of PBFT and CC + PBFT became 6.2212 and 1.1191 s, respectively. Thus, PBFT equipped with our approach obtained 3.77 times (=0.1237/0.0328) higher performance than PBFT with the initial four nodes, and the gain grew so that it achieved 5.56 times (=6.2212/1.1191) higher performance with 80 nodes. Across all node configurations, the performance of PBFT with our approach was an average of 4.75 times higher than PBFT alone. Therefore, we can recognize that our approach contributed to improving the performance of the PBFT consensus algorithm. In addition, we observed that the elapsed-time growth rate of PBFT is larger than that of CC + PBFT: while the growth of PBFT from 4 to 80 nodes is 50.29 times (=6.2212/0.1237), that of CC + PBFT is 34.12 times (=1.1191/0.0328). This implies that our approach improves the PBFT consensus algorithm's scalability as the number of nodes increases, compared to PBFT alone. The elapsed time of PBFT increases because all nodes must participate in the consensus process, so the number of communications grows, and the consensus process must always be executed for all transactions (i.e., one million transactions). In contrast, our approach checks the equality of transactions and executes the consensus process only for trouble transactions.
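The headline ratios can be re-derived directly from the reported per-transaction averages (seconds):

```python
pbft    = {4: 0.1237, 80: 6.2212}   # PBFT alone
cc_pbft = {4: 0.0328, 80: 1.1191}   # PBFT + consensus coordinator

speedup_4   = pbft[4] / cc_pbft[4]        # ~3.77x at 4 nodes
speedup_80  = pbft[80] / cc_pbft[80]      # ~5.56x at 80 nodes
growth_pbft = pbft[80] / pbft[4]          # elapsed time grows ~50.29x
growth_cc   = cc_pbft[80] / cc_pbft[4]    # only ~34.12x with the coordinator
```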
Thus, depending on the proportion of trouble transactions associated with the number of the nodes, the elapsed time of CC + PBFT is increased but not as steeply as that of PBFT.

RQ2: What is the Correlation between the Trouble Transactions and Performance?
The second research question aims to find out how much our approach can contribute to the performance improvement of the BFT-based consensus algorithm. In the real world, all blockchain nodes issue transactions independently, and it rarely happens that all transactions in the transaction pools are equal. According to Donet and Pérez-Solà's experiment [32], the transaction propagation time in a Bitcoin network of 344 nodes took 35 min on average, which means that the transactions in each node's transaction pool commonly differ. For this research question, we built an environment in which each node has many trouble transactions, as in the real world, by controlling the transaction issue interval, and computed the correlation between the elapsed time and the proportion of trouble transactions, denoted τ.
Experimental setting for RQ2. To simulate this environment, we started from the experimental setting for RQ1 with the same hardware specification and issued one million transactions, controlling the transaction issue interval from every 15 ms down to 1 ms. Then, we measured the total elapsed time and obtained the average elapsed time per transaction by dividing the total elapsed time by the number of transactions, as shown in Table 1. In addition, we observed the proportion of trouble transactions τ in the consensus coordinator to obtain the correlation between trouble transactions and the elapsed time.
Experimental results for RQ2. Table 1 shows the result of the experiment. In the table, the columns 15 ms, 10 ms, 5 ms, 2 ms, and 1 ms denote the transaction issue intervals; we selected only some representative intervals. We computed τ by averaging the proportion of trouble transactions in the transaction pools collected from each node for every time_p ~ time_c period (i.e., 10 s). Based on τ, we highlight cells with colors corresponding to their τ values (see the color legend in the table). With 16 nodes and a transaction issued every 15 ms, our approach obtained an average elapsed time of 0.099 s per transaction (see the italicized value in the table). As the transaction issue interval decreased, the average elapsed time and τ increased. Likewise, an increase in the number of nodes caused an increase in the average elapsed time and τ.
From the result, we established the correlation between τ and the average elapsed time per transaction Avg.Elap.Time_tx, based on the transaction issue interval tit and the number of nodes N(Node), as Equation (1). The equation implies that the average elapsed time is proportional to the proportion of trouble transactions τ with factor δ. In addition, τ is inversely proportional to the transaction issue interval tit and has a logarithmic relation with the number of nodes N(Node). In the experiment, we obtained δ = 2.5, α = 2, and β = 0.1, indicating that the proportion of trouble transactions strongly affects the average elapsed time per transaction.
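As Equation (1) itself is not reproduced in this text, one plausible form consistent with the description above (elapsed time δ-proportional to τ; τ inversely proportional to tit and logarithmic in N(Node), with constants α and β) would be, as an assumption:

```latex
Avg.Elap.Time_{tx} = \delta \cdot \tau,
\qquad
\tau \;\approx\; \frac{\alpha}{tit} \,+\, \beta \cdot \log N(Node)
```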

RQ3: How Much Can Our Approach Improve the Scalability of IBFT of Hyperledger Besu?
We established the third research question regarding the applicability of our approach to a real-world open-source blockchain framework using another BFT-based consensus algorithm. We selected Hyperledger Besu, a popular Ethereum client implementation that supports public and private blockchains. It uses IBFT (Istanbul Byzantine Fault Tolerance), which enhances performance by decreasing the number of nodes required for transaction confirmation from 3f + 1 to 2f + 1 (IBFT: https://github.com/ethereum/EIPs/issues/650). In the experiment for RQ3, we modified the Hyperledger Besu source code to communicate with our Controller for requesting new block generation and accessing the transaction pool (our source code is available at https://github.com/jungwonrs/JwRalph_Seo/tree/master/lab/besu_backup). Then, we performed an experiment similar to that of RQ1 to figure out how much our approach can improve the IBFT consensus protocol's performance.
Experimental setting of RQ3. We modified Hyperledger Besu 1.5.1 (https://github.com/hyperledger/besu/tree/1.5.1) to communicate with our consensus coordinator. For the experiment, we transmitted random transactions to Hyperledger Besu on a regular basis during the designated period. Initially, the number of nodes was four, and we gradually increased it to 40. Then, we measured the number of transactions contained in the generated blocks to determine throughput. We set the block generation time of Hyperledger Besu to 10 s, and our coordinator execution interval was also set to 10 s, because this setting showed the best performance. Each node configuration was run for 5 min (=300 s, i.e., 30 block generations).
In the experiment, the transaction transmission interval started from 10 ms, because a significant number of transactions was missed in Hyperledger Besu with intervals under 10 ms. We carried out this experiment with transmission intervals in 5-ms increments from 10 to 40 ms. The hardware for the blockchain nodes and the consensus coordinator was an Intel i7-8700 3.2 GHz CPU with 24 GB RAM, running Windows 10; all nodes and the consensus coordinator were executed on one computer. Due to this hardware limitation, the maximum number of Hyperledger Besu nodes was set to 40. In addition, the gas limit of Hyperledger Besu was removed to generate transactions continuously (we followed the method shown on the official Besu website: https://besu.hyperledger.org/en/stable/HowTo/Configure/FreeGas/).
Experimental results of RQ3. Figure 5 shows the result of the experiment with a 10-ms transmission interval and from 4 to 40 blockchain nodes. In the figure, the results of IBFT alone and IBFT equipped with our approach are expressed as IBFT and CC + IBFT, respectively. Figure 6 shows the result with a 25-ms interval. With a 25-ms transaction interval, the total number of transactions that can be issued in 5 min is 12,000 (=300/0.025), which is the maximum number of transactions that can be contained in the generated blocks. With four blockchain nodes, the transaction counts of IBFT and CC + IBFT are 11,900 and 12,000, respectively; most issued transactions are contained in the generated blocks, so the gap between the two counts is small. However, with 20 nodes, CC + IBFT achieved a 53.25% (=(11,255 − 7344)/7344) performance improvement, and with 40 nodes it obtained a 344.81% (=(7584 − 1705)/1705) improvement compared to IBFT alone. Across all node configurations, the combination of IBFT and our approach gained a 61.81% (=(103,348 − 63,868)/63,868) performance improvement on average. Thus, we can conclude that our approach improved the performance of each blockchain node configuration as well as the IBFT consensus algorithm's scalability, because the performance loss grows more slowly as the number of blockchain nodes increases. All datasets from this experiment are presented in Appendix A. While carrying out this experiment, we also observed the proportion of equal and unequal transactions in the consensus coordinator, as shown in Table 2. Since the coordinator execution interval is 10 s, our approach executes at most 30 times in 5 min.
Then, we counted the number of executions that handled equal transactions (i.e., Step 3.1, Handling Equal Transactions) and the number that handled unequal transactions (i.e., Step 3.2, Handling Unequal Transactions). These are labeled Equal Txs and Unequal Txs in the table. With four nodes at the 10-ms and 25-ms intervals, the transaction pools of all nodes were equal in 93.33% and 96.67% of the 30 coordinator executions, respectively. However, as the number of nodes increases, the proportion of equal-transaction cases decreases and that of unequal-transaction cases increases. On average, equal and unequal transactions account for 55.33% and 44.67% of executions at the 10-ms transaction issue interval, and 72% and 28% at the 25-ms interval, respectively. Thus, we recognized that our contribution to the performance and scalability improvement is positively associated with the proportion of equal transactions, as indicated by Equation (1).
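The four-node percentages above correspond to whole execution counts out of the 30 coordinator runs; the counts of 28 and 29 below are inferred from the reported 93.33% and 96.67% figures, not taken from Table 2 directly:

```python
# Coordinator runs every 10 s over a 5-min (300-s) experiment.
executions = 300 // 10   # -> 30 executions

# Four nodes: 28 of 30 executions (10-ms interval) and 29 of 30 executions
# (25-ms interval) found all transaction pools equal (inferred counts).
print(round(28 / executions * 100, 2))  # 93.33
print(round(29 / executions * 100, 2))  # 96.67
```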

Threats to Validity
Construct Validity. The results of RQ1, RQ2, and RQ3 may be influenced by the hardware specification and the version of Hyperledger Besu. Because diverse factors affect performance, experimental results such as elapsed times and transaction counts can differ across environments. However, we carried out our experiments on the same hardware for the control and experimental groups, so the relative comparison of the results remains meaningful. In addition, the result of the experiment for RQ3 can differ depending on the version of Hyperledger Besu. We selected version 1.5.1 of Hyperledger Besu, which was the most recent version at the time. However, versions are upgraded frequently, so using a different version may show different results.
Content Validity. In this paper, scalability is measured by the extent of the decrease in performance as the number of nodes increases. Based on this definition, we kept measuring the performance gap across node configurations. We also defined the term transaction issue to mean that a client issues one transaction and the transaction is contained in a new block through block generation. However, the block generation interval was set to 10 s in the experiments for RQ1 and RQ3, so elapsed time can only be measured in units of 10 s, which is not exact. Due to this issue, we ran our experiment for 2 h 40 min, which is long enough that the 10-s gap becomes negligible when measuring the average elapsed time for processing a transaction. In the experiment for RQ3, we fixed the experiment time to 30 block generations (i.e., 5 min) to resolve the issue.
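The claim that the 10-s granularity is negligible over a 2-h-40-min run can be checked with a back-of-the-envelope bound:

```python
run_seconds = 2 * 3600 + 40 * 60   # 2 h 40 min = 9600 s of experiment time
block_interval = 10                # blocks are generated every 10 s

# In the worst case, the measured elapsed time is off by one block interval;
# relative to the full run, this error is only about 0.1%.
relative_error = block_interval / run_seconds
print(f"{relative_error:.2%}")     # 0.10%
```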
Internal Validity. Experimental results may be affected by different settings of the execution interval of the consensus coordinator (i.e., time p ∼ time c ) and the block generation time for PBFT and IBFT in Hyperledger Besu. To handle this issue, we performed a sensitivity analysis and observed that setting the execution interval of the coordinator and the block generation time of PBFT and IBFT to 10 s showed the best performance. However, the performance may differ depending on the interval setting.
External Validity. In this paper, we claim that our approach is effective for BFT-based consensus algorithms, and we applied it to two of them: PBFT and IBFT. It is hard to claim that our approach applies to all BFT-based consensus algorithms. However, we selected the most popular BFT-based consensus algorithm, PBFT, from which many consensus algorithms such as IBFT, Zyzzyva [18], SBFT [19], HotStuff [33,34], and Tendermint [26] are derived. Having selected IBFT as a representative derivation of PBFT, we argue that our approach can also be applied to other BFT-based consensus algorithms.

Conclusions
This paper proposes a coordination technique for improving the scalability of BFT-based consensus algorithms. The technique is composed of four steps: (1) electing a prime node; (2) collecting transactions from transaction pools; (3) processing equal and unequal transactions; and (4) generating a new block. Our key idea is to make execution of the consensus algorithm conditional by dividing the transaction pool into equal and unequal transactions and further dividing unequal transactions into common and trouble transactions. The consensus algorithm is then executed only for the trouble transactions, and the results are merged and finalized by sharing the transactions among all blockchain nodes.
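The conditional execution described above can be sketched as follows. This is a minimal illustration under our own naming, not the paper's implementation: the `pools` and `bft_consensus` arguments are hypothetical placeholders for the collected per-node transaction pools and the underlying BFT round.

```python
from typing import Callable

def coordinate(pools: list[set[str]],
               bft_consensus: Callable[[set[str]], set[str]]) -> set[str]:
    """Return the transaction set to place in the next block.

    pools         -- one transaction pool per blockchain node (step 2)
    bft_consensus -- BFT consensus, run only on the trouble transactions
    """
    # Step 3.1: if every node holds an identical pool, the transactions are
    # "equal" and no consensus round is needed.
    if all(pool == pools[0] for pool in pools):
        return pools[0]

    # Step 3.2: otherwise split the union into "common" transactions, held by
    # every node, and "trouble" transactions, held by only some nodes.
    common = set.intersection(*pools)
    trouble = set.union(*pools) - common

    # Consensus is executed only for the trouble transactions; the agreed
    # result is merged with the common ones before block generation (step 4).
    return common | bft_consensus(trouble)
```

For example, with `pools = [{"tx1", "tx2"}, {"tx1", "tx2", "tx3"}]`, only `tx3` is submitted to consensus, while `tx1` and `tx2` bypass it, which is the source of the performance gain when most pools are equal.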
Based on this approach, we carried out three experiments to answer three research questions. As a result, PBFT equipped with our approach showed 4.75 times better performance on average than using PBFT alone. In addition, our approach improved performance by 61.81% on average compared to using IBFT alone. We also showed the correlation between performance and trouble transactions associated with the transaction issue interval and the number of blockchain nodes.
Although our approach improved the scalability of BFT-based consensus algorithms, it has explicit limitations. First, the consensus coordinator is centralized, which exposes it to a single point of failure. Second, our approach does not address recovery of the coordinator when the system fails or restarts. Third, our approach should be tested in real-world environments where diverse synchronization issues exist, such as clock synchronization across distributed nodes. For future work, we plan to research distributing the centralized consensus coordinator and establishing a recovery strategy for system failures in real-world environments.