A Hierarchy Byzantine Fault Tolerance Consensus Protocol Based on Node Reputation

Blockchain has been applied in many areas, such as cryptocurrency, smart cities and digital finance. The consensus protocol is the core part of the blockchain network, which addresses the problem of transaction consistency among the involved participants. However, the scalability, efficiency and security of the consensus protocol are greatly restricted as the number of nodes increases. In this paper, a Hierarchy Byzantine Fault Tolerance consensus protocol (HBFT) based on node reputation is proposed. A two-layer hierarchy structure is designed to improve the scalability by assigning nodes to different layers. Each node only needs to exchange messages within its group, which reduces the communication complexity between nodes. Specifically, a reputation model is proposed to distinguish normal nodes from malicious ones by a punish and reward mechanism. It is applied to ensure that malicious nodes exist only in the bottom layer, so that the communication complexity in the high layer can be further lowered. Finally, a random selection mechanism is applied in the selection of the leader node. The mechanism can ensure the security of the blockchain network through its unpredictability and randomness. Experimental results demonstrate that the proposed consensus protocol has excellent performance in comparison to some state-of-the-art models.


Introduction
A blockchain is one of the latest trends in distributed networks, where each node maintains an append-only ledger [1]. It has been applied to solve the trust challenge for large-scale collaborative works with the characteristics of decentralization, non-tampering and traceability [2][3][4]. According to the degree of decentralization, a blockchain is typically divided into three types, including public chain, consortium chain and private chain. The public chain [5] is a network that any node is allowed to participate in at any time. It is usually used in scenarios where a high latency and untrusted nodes are accepted. The consortium chain [6] is shared and managed by several institutions. In the private chain [7], the nodes are all controlled by one institution.
The consensus protocol plays a vital role in the blockchain for ordering transactions and guaranteeing the consistency of the ledger stored in the node. The consensus protocol [8] determines the performance of the blockchain system to a large extent, such as latency, transaction throughput, scalability and so on [9]. Proof-based consensus protocols are widely applied to many public blockchains, such as Proof-of-Work (PoW) [10] in the Bitcoin system, Proof-of-Stake (PoS) [11] and Delegated Proof-of Stake (DPoS) [12] in Ethereum. Such protocols are designed with an excellent node scalability through node competition. However, they are greatly energy-consuming and have a long transaction confirmation delay. For instance, [13] points out that the transaction confirmation delay is typically limited to 10 min in Bitcoin and 3 min in Ethereum, while in a consortium and private blockchain, lighter consensus protocols such as PBFT [14], Paxos [15] and results indicate that the proposed method has excellent performance in comparison to some state-of-the-art models.
To summarize, the main contributions of this paper are as follows:
• A hierarchy structure is designed to assign nodes to two layers. It is more scalable, since it reduces the communication complexity of the nodes to linear, compared to the quadratic level of PBFT.
• A reputation model is proposed to evaluate the behaviors of nodes in the process of a transaction. It can be applied to prevent malicious nodes from destroying the consensus and to improve the security. The maximum proportion of Byzantine nodes that can be tolerated is higher than that in PBFT.
• A random selection strategy is proposed to alleviate the concentration of the leader role. Nodes with higher reputation values do not get a greater chance of becoming the leader. Additionally, such a strategy increases the robustness of the network by making the next leader unpredictable, whereas the next leader is chosen in a fixed order in PBFT.
The remainder of this paper is organized as follows. A combination of local and global reputation model is described in Section 2. Some details of the HBFT consensus protocol are presented in Section 3. The performance analysis is presented in Section 4. The experimental results are discussed in Section 5, followed by some conclusions in Section 6.

Combination of Local and Global Reputation Models
Both normal and malicious nodes are included in the blockchain network. Some consensus protocols, such as PBFT [14] and DBFT [25], do not differentiate normal or malicious nodes before and after consensus. Reputation models have been proposed by some works [30,31] to distinguish normal nodes from malicious ones. Since the combination of local and global reputations was not considered in [30,31], it did not perform well in preventing malicious nodes from gaining high-reputation values.
Some details of the proposed reputation model are given to overcome the limitations mentioned above. Firstly, a local reputation model is described in Section 2.1, which is applied to evaluate every transaction and score the nodes with a local reputation value. A global reputation model is proposed in Section 2.2 to incorporate the local reputation values of different nodes by designed rules.

Local Reputation Model
In the transaction phase, a node may be involved in a number of transactions with different nodes in the network. Let t_ij be the transaction score for node j as scored by node i. Since malicious nodes may deliberately underscore a normal node, a mutual assessment mechanism is proposed: node j gives the same score to node i after it receives the score. T_max is set as the maximum waiting time for a node. During a transaction event, if a node does not receive the correct response within T_max, the transaction event is regarded as a failure, and the penalty mechanism is triggered. Otherwise, the nodes in this transaction event are rewarded.
In the case of a reward, the reward function for node i (from any other node) is defined in terms of the following quantities. rewards_i is the positive score that node i obtains from other nodes during a single transaction event. * is the multiplication operator. λ_1 is a reward moderator that controls the magnitude of the increase in the node's reputation. R_max denotes the maximum value of the reputation. R_i is the current reputation of node i. h represents the number of historic rounds that are taken into account for node i, which will be discussed later. S_count is the number of rounds in which node i responded properly in the past h rounds of transaction phases. T_resp represents a time coefficient positively correlated with the node response time. T_actual is the actual response time of the node. The idea behind this reward mechanism is that if node i successfully completes many transactions during a series of continuous transaction phases, the reputation reward magnitude of node i will gradually decrease.
When a node refuses or delays a transaction request from other nodes within the required period, the node that trades with it gives a negative evaluation. If a node keeps failing to finish transactions, its reputation will conversely decrease. The penalty value for node i is defined in terms of the following quantities. decay_i is the negative score that node i gets from the other node during a single transaction event. λ_2 is a penalty moderator that controls the magnitude of the decline in the node's reputation. R_min denotes the minimum value of the reputation. Some previous works [30,31] tend to punish or reward nodes at a fixed value, which makes it difficult to differentiate normal and malicious nodes. In the proposed local reputation model, the reputation values of malicious nodes can be reduced in a timely manner, and the reputation values of normal nodes can be enhanced quickly. The scoring process in phase n is shown in Figure 1.
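The qualitative behavior of the local scoring rule can be illustrated with a short sketch. The closed-form reward and penalty expressions are not reproduced in this text, so the two functions below are illustrative stand-ins and not the model's actual formulas: they only mimic the described behavior (the reward shrinks as R_i approaches R_max and as the success streak grows, and the penalty grows with recent failures). The names `LocalParams`, `reward`, `decay` and `score`, as well as the default values of λ_1 and λ_2, are assumptions made for this example.

```python
from dataclasses import dataclass

R_MIN, R_MAX = 0.0, 1.0  # reputation bounds (the values used in the experiments)


@dataclass
class LocalParams:
    lam1: float = 0.1      # reward moderator lambda_1 (hypothetical value)
    lam2: float = 0.2      # penalty moderator lambda_2 (hypothetical value)
    h: int = 3             # historic rounds taken into account
    t_max: float = 1000.0  # maximum waiting time T_max


def reward(r_i, s_count, t_actual, p):
    """Stand-in reward: smaller near R_max, after long streaks, and for slow replies."""
    t_resp = t_actual / p.t_max   # time coefficient, grows with the response time
    headroom = R_MAX - r_i        # room left before the reputation cap
    streak = s_count / p.h        # share of the last h rounds answered properly
    return p.lam1 * headroom * (1.0 - 0.5 * streak) * (1.0 - 0.5 * t_resp)


def decay(r_i, s_count, p):
    """Stand-in penalty: larger when more of the last h rounds failed."""
    failures = p.h - s_count
    return -p.lam2 * (r_i - R_MIN) * (1 + failures) / (1 + p.h)


def score(r_i, s_count, t_actual, p):
    """Score one transaction event; t_actual=None models a missing response."""
    if t_actual is not None and t_actual <= p.t_max:
        return reward(r_i, s_count, t_actual, p)
    return decay(r_i, s_count, p)
```

Under these assumptions, a timely response always yields a positive score and a timeout always yields a negative one, which is the only property of the original model that the sketch claims to preserve.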

Global Reputation Model
The proposed local reputation model can improve the security by punishing malicious nodes and rewarding normal ones. However, malicious nodes may attack normal ones by underscoring them many times at one transaction stage. A global reputation model is designed to avoid such attacks by comprehensively considering the scores of all other nodes. Let t_ij be the transaction score for node i evaluated by node j at a transaction event, which could be a reward or a decay. It is computed from t_r, the evaluation score of the rth transaction among the N transactions between i and j. The N transactions are independent of the evaluation events between nodes. The reputation of node i is then updated using the following quantities. c_i is the final increment of the reputation value of node i. P is the number of nodes in the network. a_k is a weight distribution function, which is used to weight the contribution of the transaction scores from other nodes towards node i; a_k will be discussed later. R_{i,n} is the reputation of node i at transaction phase n.
To show the relationship between the local and global reputation models more intuitively, the update process of the reputation values is described in Algorithm 1 and Figure 2.
Algorithm 1. Update node reputation value
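A minimal sketch of one possible global update is given below. It is not Algorithm 1 itself. It assumes that t_ij is the mean of the N per-transaction scores t_r (so that scoring a node many times in one stage does not multiply the effect), that c_i is the a_k-weighted sum of the t_ij values, and that the new reputation is clamped to [R_min, R_max]. The function names `pairwise_score` and `update_reputation` are hypothetical.

```python
def pairwise_score(event_scores):
    """t_ij: mean of the N per-transaction scores t_r that one rater gave node i."""
    return sum(event_scores) / len(event_scores) if event_scores else 0.0


def update_reputation(r_prev, scores_by_rater, weights, r_min=0.0, r_max=1.0):
    """Return R_{i,n} from R_{i,n-1} and the scores collected in this phase.

    scores_by_rater maps a rater id to the list of scores it gave node i.
    weights maps a rater id to its weight a_k (expected to sum to 1).
    """
    # c_i: weighted contribution of every rater's averaged score (assumed form).
    c_i = sum(weights[k] * pairwise_score(s) for k, s in scores_by_rater.items())
    # Keep the reputation inside its allowed range.
    return min(r_max, max(r_min, r_prev + c_i))
```

With this form, a rater that submits ten low scores in one phase has the same influence as one that submits a single low score, which matches the stated purpose of the global model.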

Hierarchy Structure
The data throughput of PBFT [14] is low due to the high communication complexity between nodes. A tree topology structure was proposed in [28,29] to reduce communication complexity by assigning nodes to several different layers. It can be applied to improve the data throughput of a network. However, it is difficult to ensure the consistency with malicious nodes existing in the high layer.
A two-layer hierarchical structure based on node reputation is designed to address the problem mentioned above. Each node is assigned into one of the two layers according to its current reputation value. Nodes in the network are ranked by reputation values from high to low after the evaluation of the transaction stage. Some nodes in the top ranking of the reputation value are placed in the high layer, and the rest are placed in the low layer. In this way, both the node scalability and data throughput of the network can be improved.
Suppose there are H nodes in the high layer; then the nodes in the low layer are accordingly separated into H-1 groups, called subgroups. In the two-layer structure, there are clients, consensus nodes, subleader nodes and a leader node. The leader node is responsible for collecting the transaction data and packaging it into blocks for the entire blockchain. The subleader node is a candidate for the leader node. It can participate in the verification of blocks and broadcast blocks to the consensus nodes in its subgroup. The consensus node is responsible for verifying the blocks received from the subleader node. It can check and report malicious behavior by a subleader node. The client can initiate transactions by connecting to one of the nodes in the blockchain. The overall hierarchy structure is shown in Figure 3. The number of nodes in each subgroup is not fixed and affects the final performance of the network. The distribution strategy of the nodes in the low layer will be discussed later.
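The layer assignment described above can be sketched as follows. The function name `build_hierarchy` is hypothetical, and the round-robin split of the low layer is only one possible choice, made here to keep the example short; it is not a statement of the allocation scheme adopted in the protocol.

```python
def build_hierarchy(reputation, h):
    """Split nodes into a high layer of h nodes and h - 1 low-layer subgroups.

    reputation maps a node id to its current reputation value.
    Returns (high_layer, subgroups), where subgroups is a list of lists.
    """
    if h < 2 or h > len(reputation):
        raise ValueError("need 2 <= h <= number of nodes")
    # Rank nodes by reputation, highest first.
    ranked = sorted(reputation, key=reputation.get, reverse=True)
    high, low = ranked[:h], ranked[h:]
    # Deal the remaining nodes round-robin into h - 1 subgroups (assumed scheme).
    subgroups = [low[g::h - 1] for g in range(h - 1)]
    return high, subgroups
```

For example, with ten nodes and h = 3, the three highest-ranked nodes form the high layer and the remaining seven are dealt into two subgroups of four and three nodes.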
The block data transmission process of HBFT is described in Figure 4. The main processes are as follows:
(1) Request phase. The client initiates a transaction request to the node it connects to and signs the transaction with its private key. The request message contains the timestamp t, the transaction data d, the hash value g(d) of the transaction data, the client identification c, the identification number id of the leader node and the client signature Sig_c. The node verifies the identity of the client and the timestamp t in the blockchain. If the authentication is successful and the timestamp is not out of date, the node sends the request to the leader node.
(2) Prepare phase. After receiving the request message from the client, the leader node orders and packages the transaction data into a block and then broadcasts it to the subleader nodes. The message contains the request message m and the signature of the leader node, Sig_leader.
(3) Lpre-prepare phase. The subleader node checks whether the signature is correct. If the verification is successful, it signs the message with Sig_sub and forwards it to the nodes in its subgroup.
(4) Lprepare phase. After a consensus node in the subgroup verifies the signatures of both the leader node and the subleader node, it sends a confirmation message signed with Sig_i, called the lprepare message, to the other nodes in the same subgroup.
(5) Lcommit phase. When a consensus node collects 2f + 1 correct lprepare messages, it sends a commit message to the subleader node.
(6) Reply phase. If the subleader node receives valid messages from more than half of the nodes in the lcommit phase, it commits the block to the blockchain and replies to the client.
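The two thresholds used in the lcommit and reply phases can be sketched with a small bookkeeping class. This is an illustrative sketch only: the class name `SubgroupVote` and its methods are hypothetical, signature verification is assumed to have happened before a message is recorded, and f is treated as a given parameter because its definition within a subgroup is not spelled out here.

```python
class SubgroupVote:
    """Tracks lprepare and lcommit messages for one block inside one subgroup."""

    def __init__(self, group_size, f):
        self.group_size = group_size  # consensus nodes in the subgroup
        self.f = f                    # assumed fault parameter used in 2f + 1
        self.lprepare = {}            # receiver id -> set of sender ids seen
        self.lcommit = set()          # nodes whose commit reached the subleader

    def on_lprepare(self, receiver, sender):
        """Record a valid lprepare; True once receiver holds 2f + 1 of them."""
        seen = self.lprepare.setdefault(receiver, set())
        seen.add(sender)  # a set ignores repeated messages from the same sender
        return len(seen) >= 2 * self.f + 1

    def on_lcommit(self, sender):
        """Record a valid lcommit; True once more than half of the group committed."""
        self.lcommit.add(sender)
        return len(self.lcommit) > self.group_size / 2
```

Using sets rather than counters means that a node repeating its own message cannot push a peer over either threshold.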

Leader Selection Mechanism
The leader node is responsible for the block packaging and distribution. The state of the leader node determines the security and efficiency of the consensus protocol. The leader node in [10,11] was usually selected as the one who occupied the most computation power or stakes. However, it is energy-consuming and may cause an insecure concentration of the computation power or stakes. A consistent and trust fusion method was proposed in [29], where a node with a higher reputation is more likely to be selected as a leader. It does not consume much energy; however, it harms the fairness of the blockchain and may cause an insecure concentration of the reputation. A random selection consensus protocol was presented in [32]. It adopted a verifiable random function to select a committee that includes a leader node as well as a set of verifier nodes. It guarantees the randomness of the selection process. However, it requires the frequent replacement of committee members, which is inefficient.
To overcome the limitations mentioned above, a random selection mechanism is designed. Every node is qualified to participate in the consensus process. In order to enable a node to prove that it is selected as the leader, the mechanism requires node i to have a public/private key pair (pk_i and sk_i), and the nodes in the network do not keep any private state, except for their private keys. The mechanism is implemented using verifiable random functions (VRFs) [33]. For any legal input x, VRF_sk(x) returns two values: a hash and a proof. The hash is a hashlen-bit-long value that is determined only by sk and x. To any node that does not know sk, the hash is indistinguishable from a random value. The proof enables the nodes to verify that the hash truly corresponds to x without knowing sk.
At the beginning of each consensus epoch, there is a short phase in which the nodes evaluate the VRF on the given seed and exchange messages containing their results. The node with the minimum result is selected as the leader of this epoch, labeled as leader in Figure 5. The items in the shared data are publicly known by every node. The randomness of the proposed selection mechanism comes from a publicly known seed. The processes of seed generation and distribution are shown in Figure 5.
The seeds should be public and cannot be controlled by the attacker. For each epoch of consensus, a new seed is generated. The seed of epoch r is determined by the current leader using VRFs with the seed of the previous epoch. seed_r and the proof cert can be calculated as follows:
seed_r = VRF_hash(sk_i, seed_{r−1})
cert = VRF_proof(sk_i, seed_{r−1})
The value of seed_0 can be chosen randomly using distributed random number generation [34]. The seed and corresponding proof are additionally added into the proposed block. As long as the block of epoch r reaches a consensus, every node knows the seed for the next epoch. The block broadcast process has been illustrated in Figure 4. A node can verify that seed_r is indeed produced from seed_{r−1} by the leader with the leader's pk and cert:
seed_r = VRF_P2H(cert)
True/False = VRF_verify(pk_i, seed_{r−1}, cert)
The leader selection mechanism is triggered by the following conditions. Firstly, the leader node has its own term of office, called the epoch. An epoch usually includes multiple rounds of the consensus. Once the epoch of the leader node ends, the nodes in the consensus group automatically select another leader. This can improve the decentralization of the network, as each node has the opportunity to be the leader node for a limited time. Secondly, the leader node may crash unexpectedly due to a network delay or other reasons. In order to keep the network working, the epoch of the current leader node is terminated, and another leader is selected. The proposed random selection mechanism can ensure the fairness and security of the network by selecting the leader node randomly.
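The minimum-output selection rule and the seed chaining can be sketched as follows. HMAC-SHA256 is used here purely as a placeholder for the VRF: it is not a verifiable random function, because a keyed hash cannot be checked by other nodes with the public key alone. A real deployment would need an actual VRF construction. The function names `vrf_stand_in`, `select_leader` and `next_seed` are hypothetical.

```python
import hashlib
import hmac


def vrf_stand_in(sk: bytes, x: bytes):
    """Return (hash, proof) for input x under secret key sk.

    Placeholder only: HMAC-SHA256 is NOT a VRF, since its output cannot be
    verified with a public key. The hash is derived from the proof, which
    mirrors the proof-to-hash step of a VRF.
    """
    proof = hmac.new(sk, x, hashlib.sha256).digest()
    return hashlib.sha256(proof).digest(), proof


def select_leader(secret_keys, seed: bytes):
    """Every node evaluates the function on the shared seed; the minimum hash wins."""
    outputs = {node: vrf_stand_in(sk, seed)[0] for node, sk in secret_keys.items()}
    return min(outputs, key=outputs.get)


def next_seed(leader_sk: bytes, prev_seed: bytes):
    """Derive the seed of the next epoch and its proof from the previous seed."""
    seed_r, cert = vrf_stand_in(leader_sk, prev_seed)
    return seed_r, cert
```

In this sketch every node's secret key sits in one dictionary only so that the example runs in a single process; in the protocol each node would compute its own output locally and share the hash together with its proof.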

Communication Complexity
Communication complexity is an important index of the consensus protocols. It can be reflected by the number of communication times. The communication complexity of PBFT [14] is O(P^2), which means that the number of communication times increases quadratically with P. In our HBFT, the communication complexity is optimized.
Suppose there are l nodes in each subgroup and the number of nodes in each subgroup is consistent. The total number of nodes P in the network is: For the nodes in the low layer, the number of communication times C1 is: For the nodes in the high layer, the number of communication times C2 is: Therefore, the gross number of communication times C in the network is: C can also be expressed in P according to Equation (17) as: One can find from Equation (18) that the gross number of communication times C increases linearly with the total number of nodes P. In other words, the developed consensus protocol can significantly reduce the communication complexity by comparison with PBFT [14]. It will be proven in the experiments later.
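The linear growth can be checked numerically with a rough message count. The counting model below is an assumption made for illustration and is not the paper's own derivation of C1 and C2: it counts one message from the leader to each subleader, one from each subleader to each of its consensus nodes, an all-to-all exchange inside each subgroup, and one commit per consensus node. The names `pbft_messages`, `hbft_messages` and `total_nodes` are hypothetical, and x denotes the number of subgroups.

```python
def pbft_messages(p):
    """All-to-all exchange among p nodes: on the order of p^2 messages per round."""
    return p * p


def total_nodes(x, l):
    """Leader + x subleaders + x * l consensus nodes (assumed layout)."""
    return 1 + x + x * l


def hbft_messages(x, l):
    """Rough per-round count for x subgroups of l consensus nodes (assumed model)."""
    prepare = x                  # leader -> each subleader
    lpre_prepare = x * l         # subleader -> its consensus nodes
    lprepare = x * l * (l - 1)   # all-to-all inside each subgroup
    lcommit = x * l              # consensus node -> its subleader
    return prepare + lpre_prepare + lprepare + lcommit
```

Under this model the total is x(1 + l + l^2), so for a fixed subgroup size l the count grows in proportion to the number of subgroups and therefore linearly in P, whereas the all-to-all count grows with P^2.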

Byzantine Fault Tolerance
The Byzantine fault tolerance reflects the security of the consensus protocol. The more Byzantine nodes a consensus protocol can tolerate, the more secure it is. The upper limit of the number of Byzantine nodes in PBFT [14] is (P − 1)/3. In our HBFT, the Byzantine fault tolerance rate is optimized. In addition to the leader node, the total number of nodes in the high layer x is: Since there is no Byzantine node in the high layer, a consensus can be reached as long as more than half of the subleader nodes reply correctly. During the consensus process in the low layer, the subleader node in each subgroup needs to receive more than one-third of the correct messages. In the worst case, half of the subgroups are comprised of Byzantine nodes, and their subleader nodes cannot reply correctly. The maximum number of Byzantine nodes in HBFT is: Since l ≥ 3, we can conclude that the developed consensus protocol can tolerate more Byzantine nodes.

Experiments
In order to test the effectiveness of the proposed protocol, an experimental network was built as follows. Every node runs on a physical or virtual machine with equivalent performance. We initialize the nodes with different data transmission delays. Every node can participate in the consensus, and no node is excluded from the network. Each node keeps two ledgers: one for recording the reputation values in the transaction phase and the other for recording the transactions in the consensus process. For the security experiments, we set some nodes as malicious ones. They can hide or tamper with transaction data, deliberately underscore normal nodes and even collaborate with their partners to get high scores. Normal nodes always respond correctly and in time.

Reputation Model Parameters
Reputation is an important index for assigning nodes in the two-layer structure. To improve the security of the network, some parameters should be optimized for two purposes: one is to prevent malicious nodes from getting high scores, and the other is to reduce the negative influence that malicious nodes have on normal nodes. In this experiment, R_max was set to 1, and R_min was set to 0. Since the experiment was simulated in a local area network, T_max was set to 1000.

Historic Rounds
When a node scores its transaction partner, historic rounds are helpful in judging whether the node is malicious. In the experiment, we simulate three normal nodes (node0, node1 and node3) and a malicious node (node2). In order to find the optimal value of h, h is varied from 1 to 9 at an interval of 2. Additionally, we set the proportions of the transactions between the three normal nodes and the malicious node to different values, which are 6.25% for node0, 62.5% for node1 and 31.25% for node3, respectively. Some results are shown in Figure 6. It can be seen from Figure 6 that the value of h has a significant influence on the reputation values of the nodes. The malicious node is successfully blocked from getting high reputation values, but the normal nodes are influenced by the malicious node to different extents as h changes. When h is not considered (h = 0), the gap in the reputation values between normal nodes keeps expanding. It indicates that the local reputation model without h cannot resist attacks from the malicious node well.
We selected the standard deviation to quantitatively evaluate the negative influence of the malicious node on the normal nodes. We took the reputation values of the normal nodes at 200 rounds and computed their standard deviation. The results are given in Table 1. The lower the standard deviation is, the better the performance of the reputation model. One can find from Table 1 that, when h is set to 3, the standard deviation is the minimum. In the subsequent experiments, h is set to 3 and kept the same.
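The selection criterion can be expressed in a few lines. This sketch assumes the population standard deviation; whether the population or the sample form was used for Table 1 is not stated. The function names `spread_by_h` and `best_h` are hypothetical, and no values from Table 1 are reproduced.

```python
from statistics import pstdev


def spread_by_h(final_reputation_by_h):
    """Map each h to the standard deviation of the normal nodes' reputations."""
    return {h: pstdev(values) for h, values in final_reputation_by_h.items()}


def best_h(final_reputation_by_h):
    """Pick the h whose normal-node reputations are the most tightly grouped."""
    spread = spread_by_h(final_reputation_by_h)
    return min(spread, key=spread.get)
```

The input is expected to map each tested h to the list of reputation values of the normal nodes at the chosen round.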

Weight Selection Distribution
Malicious nodes may amplify the impact of their own evaluations by falsifying their reputation values. In the proposed global reputation model, a_k is introduced to assign different weights to the nodes. It is only related to the reputation values of the nodes. The value of a_k is between 0 and 1, and the a_k values sum to 1.
Two kinds of weight selection functions will be discussed. The default one is the uniform distribution, and the other one is the linear distribution. For uniform distribution, the weights are all equal: For linear distribution, suppose k is the slope. The reputation values of each node are listed in descending order and mapped into certain equidistant values in [0, b], and b can be calculated as follows:
In the experiment, we simulated a group of nodes, including multiple malicious nodes. We set f as the proportion of the malicious nodes in the network. Since the maximum value of f in PBFT [16] does not exceed one-third, f is set to one-sixth, one-fourth and one-third, respectively. Some simulation results are shown in Figure 7. One can find from Figure 7a-c that the gap in the reputation values between the normal and malicious nodes narrows as f increases. When the distribution is uniform and f is one-third, the reputation value of the malicious nodes is almost the same as that of the normal nodes after 2000 rounds. It indicates that a node with a higher reputation could be malicious.
We set the distribution as linear and f as one-third, and the results are shown in Figure 7d. The reputation value of normal nodes apparently surpasses that of malicious nodes at around 800 rounds of transaction phase. It shows that, when the distribution is linear, the malicious nodes can be well-disabled. The distribution is set as linear and remains unchanged in the later experiment.
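The difference between the two weighting schemes can be sketched as follows. The exact mapping onto [0, b] and the formula for b are not reproduced here, so `linear_weights` uses a simple rank-based stand-in: nodes are ranked by reputation, assigned equidistant raw values, and normalized so that the weights sum to 1. Both function names are hypothetical.

```python
def uniform_weights(n):
    """Every rater contributes equally; the weights sum to 1."""
    return [1.0 / n] * n


def linear_weights(reputations):
    """Rank-based linear weights: a higher reputation gives a larger weight.

    Stand-in for the linear distribution: raw values n, n-1, ..., 1 are
    assigned in descending reputation order and then normalized.
    """
    n = len(reputations)
    order = sorted(range(n), key=lambda i: reputations[i], reverse=True)
    raw = [0.0] * n
    for rank, i in enumerate(order):
        raw[i] = n - rank  # equidistant steps between consecutive ranks
    total = sum(raw)
    return [w / total for w in raw]
```

With such a scheme, the scores issued by low-reputation nodes count for less, which is consistent with the observation that colluding malicious nodes gain less under the linear distribution than under the uniform one.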

Node Allocation Scheme
Existing research has provided many solutions for node management. For example, a smart collaborative balancing scheme was proposed in [35] to dynamically adjust the orchestration of the network and smartly allocate the bandwidth for each node. A node overhaul scheme proposed in [36] efficiently improves the network lifetime by creating uniform clusters of good quality. It mainly considers the size of the cluster and the total intra-cluster communication distance.
In the proposed protocol, the nodes are allocated into two layers. Consensus nodes are assigned to subgroups, and w is defined as the ratio of consensus nodes to subleader nodes. Three allocation schemes are discussed: average, random and geographic. In the average scheme, each subgroup has the same number of nodes. In the random scheme, the number of nodes in each subgroup is uncertain: a subgroup may be empty or arbitrarily large. In the geographic scheme, the real position of the nodes is taken into consideration, so that adjacent nodes are placed in the same subgroup.
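The three allocation schemes can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the average scheme is realized as round-robin assignment, and the geographic scheme assumes that each consensus node joins the subleader closest to it in two-dimensional coordinates, which the text does not specify.

```python
import random
from typing import Dict, List, Tuple

Position = Tuple[float, float]


def allocate_average(nodes: List[str], groups: int) -> List[List[str]]:
    """Round-robin assignment: subgroup sizes differ by at most one."""
    out: List[List[str]] = [[] for _ in range(groups)]
    for i, node in enumerate(nodes):
        out[i % groups].append(node)
    return out


def allocate_random(nodes: List[str], groups: int,
                    rng: random.Random) -> List[List[str]]:
    """Each node picks a subgroup uniformly at random; some may stay empty."""
    out: List[List[str]] = [[] for _ in range(groups)]
    for node in nodes:
        out[rng.randrange(groups)].append(node)
    return out


def allocate_geographic(positions: Dict[str, Position],
                        subleaders: Dict[str, Position]) -> Dict[str, List[str]]:
    """Assumed rule: each consensus node joins its nearest subleader."""
    out: Dict[str, List[str]] = {s: [] for s in subleaders}
    for node, (x, y) in positions.items():
        nearest = min(
            subleaders,
            key=lambda s: (subleaders[s][0] - x) ** 2 + (subleaders[s][1] - y) ** 2,
        )
        out[nearest].append(node)
    return out


if __name__ == "__main__":
    names = [f"n{i}" for i in range(6)]
    print(allocate_average(names, 3))
    print(allocate_random(names, 3, random.Random(0)))
```

Passing an explicit `random.Random` instance keeps the random scheme reproducible across simulation runs.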
Data throughput and latency are two important indexes of consensus protocols. Data throughput is expressed as the number of transactions per unit time (TPS). A higher data throughput denotes a more efficient consensus protocol. Latency is the time difference between transaction submission and transaction confirmation. A lower latency denotes a better performance of the consensus protocol. In the experiment, the value of w is varied from 1 to 19 at an interval of 2. The total number of nodes in the network is fixed at 61. Meanwhile, the consensus nodes are allocated into subgroups in the three ways mentioned above. The results are shown in Figure 8.
It can be seen from Figure 8 that the random scheme performs the worst in both data throughput and latency. In the average and geographic schemes, as w increases, the data throughput decreases and the latency increases. Both data throughput and latency are optimal when w is 1 for the discussed schemes. The geographic scheme performs best among the investigated ones. Therefore, the geographic scheme is adopted for node allocation, and w is set to 1 in the subsequent experiments.
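The two indexes can be computed from recorded timestamps as in the following minimal sketch. The function names and the use of seconds as the time unit are assumptions for illustration, and latency is averaged over all transactions, which the text does not state explicitly.

```python
from typing import List, Tuple


def throughput_tps(confirmed_transactions: int, elapsed_seconds: float) -> float:
    """Transactions confirmed per second over the measurement window."""
    return confirmed_transactions / elapsed_seconds


def mean_latency(timestamps: List[Tuple[float, float]]) -> float:
    """Average of (confirmation time - submission time) over all transactions.

    Each entry is a (submitted_at, confirmed_at) pair in seconds.
    """
    return sum(confirmed - submitted for submitted, confirmed in timestamps) / len(timestamps)


if __name__ == "__main__":
    print(throughput_tps(3000, 60.0))
    print(mean_latency([(0.0, 0.5), (1.0, 2.5)]))
```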

Comparisons with Relevant Consensus Protocols
To further evaluate its efficiency, the proposed protocol was compared with PBFT [14] and T-PBFT [31] in terms of communication times, data throughput and latency. The fewer the communication times, the higher the node scalability of a blockchain. The number of transactions is set to 3000. The number of nodes is varied from 4 to 40 at an interval of 3. The results are given in Figure 9. One can find from Figure 9 that the developed HBFT performs best among the investigated protocols [14,31] in communication times, data throughput and latency. The reasons are as follows. A multiple confirmation phase is introduced in [14] to ensure the consistency and correctness of the final block in the presence of malicious nodes. However, it greatly increases the communication times. The EigenTrust model is utilized in [31] to reduce the communication times. However, it does not fundamentally solve the problem, because it only allows nodes with a high reputation to participate in the consensus process. This causes centralization of the network, which violates the decentralization principle of a blockchain. The developed protocol reduces the communication times through its hierarchy structure, in which nodes only need to exchange messages with their related nodes. The reputation model is utilized to assign reliable nodes to the high layer, where the confirmation phase is shortened. This effectively increases the data throughput and reduces the latency. In addition, the proposed leader selection mechanism ensures the security of the network by selecting the leader node fairly.
To make the comparison fairer, some nonquantitative indicators are also taken into account. The results are based on the best parameters provided by the authors of each protocol [12,14,25,31]. The comparison results are shown in Table 2. It can be found from Table 2 that the proposed HBFT performs very well among the investigated protocols [12,14,25,31]. The reasons are as follows. DPOS [12] cannot tolerate Byzantine faults due to the lack of multiple validation during block generation. PBFT [14] is inefficient in networks composed of large numbers of nodes because its communication complexity is O(P^2). DBFT [25] and T-PBFT [31] reduce the degree of decentralization to some extent. The proposed protocol reduces the communication complexity to O(P) while keeping a high degree of decentralization. Additionally, the developed protocol selects the leader node randomly and fairly.
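The gap between O(P^2) and O(P) can be illustrated with a simplified counting model. The sketch below is not the accounting used in the experiments: it counts a single all-to-all broadcast round, assumes equal-sized groups, and ignores the traffic in the high layer and the multiple phases of a real protocol. The function names are hypothetical.

```python
def flat_messages(p: int) -> int:
    """One all-to-all round among p nodes: each node sends to the other p - 1."""
    return p * (p - 1)


def grouped_messages(p: int, group_size: int) -> int:
    """One all-to-all round inside each group only.

    Assumes p is a multiple of group_size. The total equals p * (group_size - 1),
    which grows linearly in p when the group size is fixed.
    """
    if p % group_size != 0:
        raise ValueError("p must be a multiple of group_size in this sketch")
    groups = p // group_size
    return groups * group_size * (group_size - 1)


if __name__ == "__main__":
    for p in (4, 16, 40):
        print(p, flat_messages(p), grouped_messages(p, 4))
```

In this model, growing the network from 4 to 40 nodes multiplies the flat message count by 130 but the grouped count only by 10, which is the intuition behind restricting message exchange to a node's own group.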

Conclusions
A novel consensus protocol, HBFT, based on node reputation was proposed. A hierarchy structure was developed to separate nodes into two layers. It reduces the communication times between nodes and improves the scalability. A combination of local and global reputation models was proposed to evaluate the behaviors of nodes in the network. Malicious nodes are prevented from entering the high layer, which enhances the security of the network and speeds up the consensus in the high layer. Additionally, a random selection mechanism was proposed to ensure the fairness of leader node selection. The experimental results show that the proposed consensus protocol performs very well in comparison to some state-of-the-art models. Compared with PBFT [14] and T-PBFT [31], the proposed protocol achieves lower communication complexity, lower latency and higher throughput. Additionally, it can tolerate more Byzantine nodes and maintain a high degree of decentralization. In future work, we will increase the number of layers in this hierarchy structure and develop a more effective reputation model that can suppress malicious nodes even when their proportion is high.
Author Contributions: Data curation, X.W. and Y.G.; Formal analysis, X.W. and Y.G.; Funding acquisition, X.W.; Methodology, X.W. and Y.G. All authors have read and agreed to the published version of the manuscript.