1. Introduction
Traditional marine cold chain traceability systems often store the data from each stage in centralized databases. This storage method guarantees neither transparency between enterprises, nor connectivity of upstream and downstream information, nor the security and reliability of data. Fraud involving marine cold chain products is frequently reported and can compromise food safety [1]. Developing a trustworthy, transparent, and decentralized marine cold chain traceability system is therefore a pressing research problem for the academic community and food regulatory agencies [2].
The marine cold chain traceability system is specifically designed to trace and monitor the entire process of cold chain products in the marine environment, from fishing and processing to transportation [3]. By monitoring and recording parameters, the system helps keep fishery products in optimal condition during transportation and storage [4,5]. However, traditional marine cold chain traceability systems often face problems such as data opacity, information silos, and traceability difficulty, leading to food safety hazards. Blockchain technology is fundamentally a distributed ledger that combines multiple core technologies, including P2P networks, consensus mechanisms, cryptographic algorithms, and smart contracts [6]. It features decentralization, immutability, and traceability [7]. The concept of blockchain was first proposed in 2008 under the name Satoshi Nakamoto [8]. With the development of related technologies, blockchain has incorporated smart contracts and cryptographic algorithms. Because blockchain is highly reliable and decentralized, combining it with traditional marine cold chain traceability systems can solve many of the problems these systems face [9]. Blockchain has also been applied in fields such as cybersecurity [10], healthcare [11], cold chain management [12], and agricultural traceability [13], offering new solutions for traditional fishery product cold chain traceability problems [14]. Based on application scenarios and participant permissions, blockchains are categorized into public, private, and consortium chains [15].
Public chains are fully decentralized, allowing any individual or organization to join or leave the network. Private chains are used within a specific entity or organization, and participation requires authorization [16]. Consortium chains, which lie between public and private chains, consist of a group of known and trusted nodes, and any node or organization must obtain authorization to join. Consortium chains provide participants with a secure, efficient, and trustworthy environment, ensuring transparency and information sharing among organizations [17]. Compared to public chains, consortium chains offer higher throughput, faster transactions, and better privacy protection. The three types of blockchain are compared in Table 1 [18].
Each of the three blockchain types has its own application scenarios and advantages. In the marine environment, limited communication and computing resources and unstable conditions on fishery vessels impair data transmission and processing, leading to degraded system performance and operational delays that reduce overall efficiency and reliability [19]. Current research shows that consortium chains reduce the complexity and bandwidth consumption of data transmission by limiting the number of participating nodes and selecting trusted nodes [20]. Therefore, consortium chains are promising for cold chain traceability scenarios.
Leveraging these characteristics, many researchers have begun to apply consortium chains to traditional fishery product cold chain traceability systems. Zhang et al. developed a fishery product traceability system based on blockchain and the Internet of Things (BIOT-TS), which enhanced data management efficiency and ensured data security and reliability. However, BIOT-TS did not optimize the consensus algorithm for actual scenarios: as the number of nodes increases, the system’s response slows when it handles a large number of transactions, affecting traceability efficiency [21]. Facing challenges such as traceability difficulties and data tampering in the fishery supply chain, P.K. and colleagues improved the traceability efficiency and the quality safety of fishery products by integrating blockchain technology with traditional fishery product cold chain systems. Although this work addresses traceability difficulty and data tampering, it does not optimize the consensus algorithm, resulting in high communication overhead [22]. Syam, M.M. and others used mini containers for the cold chain transportation of vegetables and fruits, reducing energy consumption and greenhouse gas emissions and avoiding the food and energy waste caused by cold chain failures. Although this addresses the safety of cold chain food, cold chain traceability efficiency still needs improvement [23]. M and others applied blockchain technology to the fishery supply chain, constructing a traceability system that reduced the possibility of data tampering and thus enhanced food safety. However, the consensus algorithm is not optimized, resulting in low efficiency [24]. Liu and Yu used the theory of stochastic Petri nets to construct a blockchain e-commerce cold chain traceability system model, solving the problem of difficult e-commerce cold chain traceability. However, the model is highly complex and does not consider consensus algorithm optimization, which may cause latency when handling transactions [25].
The above studies highlight the integration of blockchain with traditional traceability systems, but they seldom explore the security and efficiency of consensus algorithms in such systems. In the practical marine fishery environment, the distance between fishery vessels, the data size, and the refrigeration temperature of nodes can affect communication quality and data transmission. Directly applying blockchain to marine cold chain traceability systems therefore faces problems of low throughput, long transaction latency, and high communication overhead.
Consensus algorithms are the cornerstone of the secure and reliable operation of blockchain networks [26]. Their primary function is to ensure that every node in the blockchain network records and stores data according to unified standards [27]. Current research shows that the PBFT algorithm has good fault tolerance in consortium chains, and it is therefore one of the most widely applied consensus algorithms in consortium chains [28]. Unlike Raft [29] and Kafka [30], which tolerate only crash faults, PBFT also tolerates Byzantine nodes. However, in the PBFT algorithm, the primary node is usually selected through a round-robin mechanism, which allows a malicious node to become the primary node when its turn comes. A malicious primary node poses a serious threat to consensus across the entire network [31]. Moreover, as the number of participating nodes in the marine cold chain traceability system increases, the traditional PBFT algorithm significantly increases the communication overhead of the system, making it unsuitable for large-scale environments [32].
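The round-robin weakness described above can be illustrated with a short sketch. It assumes the rotation rule from the original PBFT paper, in which the primary of view v is replica v mod n; the replica indices and the malicious replica used here are hypothetical.

```python
def pbft_primary(view: int, n_replicas: int) -> int:
    """Index of the primary replica for a view under round-robin rotation."""
    if n_replicas <= 0:
        raise ValueError("n_replicas must be positive")
    return view % n_replicas


def views_led_by(replica: int, n_replicas: int, n_views: int) -> list:
    """All views in [0, n_views) in which the given replica is the primary."""
    return [v for v in range(n_views) if pbft_primary(v, n_replicas) == replica]


if __name__ == "__main__":
    n = 7          # hypothetical network size
    malicious = 2  # hypothetical Byzantine replica
    # Rotation ignores behaviour, so the malicious replica leads every n-th view.
    print(views_led_by(malicious, n, 21))
```

Because the rule depends only on the view number, a replica's past behaviour has no influence on when it becomes primary, which is the gap a reputation-based selection aims to close.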
To solve the problems of low throughput, long transaction latency, and high communication overhead in marine cold chain traceability scenarios, we propose an improved consensus algorithm, NR-PBFT. First, a node grouping scheme based on a consistent hashing algorithm is designed to reduce the number of consensus nodes. Then, a reputation evaluation model that accounts for the distance between fishery vessels, data size, and refrigeration temperature in marine scenarios is proposed to select highly reliable nodes. Finally, the node grouping scheme is used to optimize the consensus protocol of the PBFT algorithm in the prepare, commit, and reply phases. NR-PBFT improves the throughput and reduces the transaction latency of the original PBFT algorithm. The main contributions of this study are summarized as follows:
- (1)
An improved PBFT consensus algorithm is proposed for a blockchain-based marine fishery cold chain traceability system.
- (2)
A node grouping scheme is designed based on a consistent hashing algorithm. It reduces the number of consensus nodes, improves the scale supported by the blockchain network, and optimizes the consensus protocol processing of the PBFT to reduce the communication overhead.
- (3)
To select highly reliable nodes, a reputation evaluation model is proposed. It updates the selection mechanism of consensus and leader nodes to reduce the probability of malicious nodes becoming leader nodes and the transaction latency.
The remainder of the paper is organized as follows:
Section 2 describes the design of a blockchain-based cold chain traceability system for marine fishery vessels;
Section 3 proposes an NR-PBFT algorithm;
Section 4 analyzes the results of experiments; and
Section 5 provides corresponding experimental conclusions and future work directions.
3. Design of NR-PBFT Consensus Algorithm
Castro M. and Liskov B. proposed the PBFT algorithm in a paper titled “Practical Byzantine Fault Tolerance” [33]. The algorithm reduces the complexity of Byzantine fault tolerance from the exponential to the polynomial level and is considered one of the best algorithms for solving the Byzantine problem. It consists of a consensus protocol, a view change protocol, and a checkpoint protocol [34]. However, in the practical marine fishery environment, the PBFT algorithm still has limitations, including low throughput, the lack of an effective reputation evaluation model, and the arbitrariness of primary node selection. These drawbacks not only increase the communication cost of the algorithm but also weaken the incentive for nodes to participate in consensus, since loyal nodes cannot be rewarded and malicious nodes cannot be punished. At the same time, malicious nodes may become primary nodes, further reducing the consensus efficiency of the entire system [35].
To address these problems, we propose a Node-grouped and Reputation-evaluated PBFT (NR-PBFT). The NR-PBFT algorithm consists of a node grouping scheme, a consensus process, a reputation evaluation model, and a Byzantine node detection mechanism.
3.1. Node Grouping Scheme
As shown in Figure 4, the identifiers (vessel numbers) of the marine fishery vessel nodes are first hashed to generate hash values. Multiple virtual nodes are then introduced for each actual node to make the hash ring more uniform, thereby avoiding the data skew caused by an uneven distribution of actual nodes. Both actual and virtual nodes are then placed on the hash ring according to their hash values, forming an ordered hash ring [36]. Finally, nodes are grouped based on their positions on the hash ring. The blockchain network comprises N nodes in total, with the number of consensus nodes denoted as A and the number of candidate nodes as B, maintaining a ratio of A:B = 3:2. The nodes are divided into m groups, and one node is randomly selected in each group to serve as its initial leader, giving m leaders in total. Nodes in the candidate node set only update their local states once consensus is reached and temporarily do not participate in the consensus process. For a blockchain network of 30 nodes divided into five groups, the grouping process is presented in Figure 5.
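The grouping steps above can be sketched as follows. This is a minimal illustration and not the exact procedure of Figures 4 and 5: the use of SHA-256, the number of virtual nodes, the rule that the first 3/5 of the ring order forms the consensus set, and the function names are all assumptions made for the example.

```python
import hashlib
import random


def ring_position(key: str) -> int:
    """Map a key to a position on a 64-bit hash ring (SHA-256 is an assumption)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def ring_order(vessel_ids, virtual_per_node=8):
    """Walk the ring in hash order and keep each real node's first appearance."""
    points = []
    for vid in vessel_ids:
        points.append((ring_position(vid), vid))                # actual node
        for k in range(virtual_per_node):                       # its virtual nodes
            points.append((ring_position(f"{vid}#vn{k}"), vid))
    points.sort()
    seen, order = set(), []
    for _, vid in points:
        if vid not in seen:
            seen.add(vid)
            order.append(vid)
    return order


def group_nodes(vessel_ids, m, virtual_per_node=8):
    """Split nodes into m consensus groups and a candidate set with A:B = 3:2."""
    order = ring_order(vessel_ids, virtual_per_node)
    n_consensus = len(order) * 3 // 5
    consensus, candidates = order[:n_consensus], order[n_consensus:]
    base, extra = divmod(len(consensus), m)
    groups, start = [], 0
    for g in range(m):
        size = base + (1 if g < extra else 0)   # contiguous, near-equal groups
        groups.append(consensus[start:start + size])
        start += size
    return groups, candidates


def pick_initial_leaders(groups, seed=0):
    """Randomly pick one initial leader per group (seeded for repeatability)."""
    rng = random.Random(seed)
    return [rng.choice(group) for group in groups]


if __name__ == "__main__":
    ids = [f"vessel-{i:02d}" for i in range(30)]   # hypothetical vessel numbers
    groups, candidates = group_nodes(ids, m=5)
    print([len(g) for g in groups], len(candidates))
    print(pick_initial_leaders(groups))
```

With 30 nodes and five groups, the sketch yields 18 consensus nodes and 12 candidate nodes, matching the 3:2 ratio; which vessels fall into which group depends on the hash function and is not meant to reproduce Figure 5.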
3.2. NR-PBFT Consensus Process
The overall process of NR-PBFT is illustrated in Figure 6. First, all marine fishery vessel nodes are initialized and grouped into consensus and candidate sets. During the consensus process, groups validate and reach consensus on requests from clients. If a group times out, the process is reinitialized; otherwise, each group randomly selects a leader node. Each group first reaches consensus within its consensus set, and its leader node then participates in the consensus outside the consensus set. The system checks whether consensus processing has timed out; if the transaction of any node times out, that node is added to a danger list. If no timeout occurs, consensus processing concludes. Each node’s score and state are then updated, and the system’s leader node is reselected to reduce the possibility of Byzantine nodes becoming leader nodes.
3.3. Reputation Evaluation Model
In marine fishery satellite communication, the distance between fishery vessels, the data size, and the refrigeration temperature of nodes can affect the communication quality. Therefore, we incorporate these factors into the reputation score calculation formula.
In NR-PBFT, each node starts with an initial reputation score of 60. The higher the reputation score, the more trustworthy the node is considered. Nodes are categorized into five status levels based on their reputation scores: Excellent, Good, Normal, Abnormal, and Malicious. Each status level has its own weight coefficient, as shown in Table 3.
Normal nodes have a reputation score between 70 and 80, which is the minimum standard to ensure participation in consensus. Nodes with scores below the range will be expelled from the blockchain system. Rewards and penalties for nodes are allocated based on their performance during the consensus process and their respective reputation levels. The specific formula for calculating the reputation score of a node is as follows:
If the leader node or any node within the group fails to participate in the consensus outside the group, or if the nodes within the group do not successfully synchronize data, the reputation score calculation formula is expressed by (1):
When a node sends different messages to different nodes or incorrect messages, the reputation score calculation formula is expressed by (2):
- (2)
Reward Rules
The parameters that affect satellite communication, namely the distance between fishery vessels, data size, and refrigeration temperature, are standardized to lie between 0 and 1:
The standardized formula for the distance
between fishery vessels is expressed by (3):
where
= 50, unit: nmi.
The standardized formula for the size
of data is expressed by (4):
Here, = 300, unit: Byte.
The standardized formula for refrigeration temperature
is expressed by (5):
Here, = −30 °C and = 10 °C.
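As an illustration of mapping the three parameters into the range 0 to 1, the sketch below uses clipped min-max scaling. This is an assumption rather than a reproduction of Equations (3) to (5): it treats 50 nmi and 300 Byte as upper bounds with a lower bound of 0, and −30 °C and 10 °C as the temperature bounds. The function and constant names are hypothetical.

```python
D_MAX_NMI = 50.0                  # assumed upper bound on inter-vessel distance
S_MAX_BYTES = 300.0               # assumed upper bound on data size
T_MIN_C, T_MAX_C = -30.0, 10.0    # assumed refrigeration temperature bounds


def min_max(value: float, low: float, high: float) -> float:
    """Scale value linearly from [low, high] to [0, 1], clipping outliers."""
    if high <= low:
        raise ValueError("high must be greater than low")
    scaled = (value - low) / (high - low)
    return min(1.0, max(0.0, scaled))


def normalize_inputs(distance_nmi: float, size_bytes: float, temp_c: float):
    """Return the three standardized factors as a tuple of values in [0, 1]."""
    return (
        min_max(distance_nmi, 0.0, D_MAX_NMI),
        min_max(size_bytes, 0.0, S_MAX_BYTES),
        min_max(temp_c, T_MIN_C, T_MAX_C),
    )


if __name__ == "__main__":
    # A vessel 25 nmi away sending 150 bytes from a hold at -10 degrees Celsius.
    print(normalize_inputs(25.0, 150.0, -10.0))
```

Whether a larger standardized value should raise or lower a node's reputation is determined by the reputation formulas themselves and is not decided by this sketch.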
When a new block is generated, the reputation score of the node based on various factors is expressed by (6), and each factor is expressed by (7) and (8), respectively:
In this context, denotes the reputation score of node , refers to the round of consensus, and is the penalty index, which is set to 0.8 in this study.
To ensure that the reputation score
is between 0 and 100, the reputation score calculation formula is expressed by (9):
Although the individual performance of nodes is crucial when selecting a leader node, the level and weights of the nodes within the group help to balance the competitive relationships between nodes, ensuring that the chosen leader node has higher overall credibility and reliability. The final reputation score calculation formula for the leader node is as follows:
Within the framework, serves as a proportionality constant, . A higher score means a higher probability that a node is elected as the leader, which helps keep Byzantine nodes out of the consensus process and improves reliability and robustness.
3.4. Optimized Consensus Protocol
The NR-PBFT consensus algorithm retains the prepare, commit, and reply phases of the original PBFT and further divides each of them into intra-group and out-group processes, as illustrated in Figure 7. The optimized consensus protocol of NR-PBFT comprises seven steps, described as follows:
The client C sends a transaction message to each leader node of the consensus set. The message comprises the transaction content, a timestamp, the client identifier, and the client’s signature.
Upon receiving the transaction message, the leader node verifies and orders the client message. It then encapsulates the message into a prepare message for its consensus set, which carries the message sequence number, the view number, the message digest, and the signature of the leader node, and broadcasts it to the intra-group members.
After receiving the message, intra-group members verify its correctness and reasonableness. If it passes verification, each member sends a commit message, signed by that intra-group node, back to the leader node.
Once the leader node receives a sufficient number of confirmations from the consensus set members and verification is successful, it signifies that the intra-group consensus is complete. The leader node then participates in the global consensus on behalf of each consensus set and broadcasts the prepared out-group message to all leader nodes.
Each leader node receives and verifies the incoming messages. If a message fails verification, it is cleared. If more than validated messages are received, then an message is broadcast to other leader nodes.
Each leader node validates the received messages, and upon receiving more than validated messages, it broadcasts the packaged to the intra-group nodes.
The leader node finally responds to the client with , marking the end of the consensus process.
3.5. Byzantine Node Detection Mechanism
In the blockchain network of the marine fishery, nodes are categorized into leader nodes, replica nodes, candidate nodes, monitor nodes, and Byzantine nodes. Leader nodes represent their consensus group in the consensus outside the group. Replica nodes first achieve consensus within the group before broadcasting the consensus results to the leader nodes. Candidate nodes are potential consensus participants. Monitor nodes primarily coordinate view changes within the network and collect reputation information sent by the leader nodes. Byzantine nodes are those that malfunction because of system failures or behave maliciously.
The mechanism’s overall flow is shown in Figure 8. The Byzantine node detection mechanism is initiated before the start of the next consensus round. The leader node sends a probe message to all nodes; the message carries a timestamp, the message content, the identifier of the leader node, and the leader’s signature.
When the leader node receives a response from a node, it verifies the authenticity and consistency of the response. It first validates the node’s signature to ensure that the message is indeed from that node and has not been altered. It then compares the content of the messages from different nodes to check for consistency. Nodes that provide consistent responses are rewarded; nodes that do not respond or that send a malicious message are penalized. The scores and statuses of the nodes are then updated.
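The checks performed on probe responses can be sketched as follows. Several simplifications are assumed here: an HMAC with a per-node shared key stands in for the digital signature, the payload returned by the largest number of validly signed responses is taken as the reference for consistency, and the reward and penalty sizes are hypothetical. Only the initial score of 60 and the 0 to 100 score range come from the text.

```python
import hashlib
import hmac
from collections import Counter


def sign(key: bytes, payload: bytes) -> bytes:
    """Stand-in signature: HMAC-SHA256 over the payload (an assumption)."""
    return hmac.new(key, payload, hashlib.sha256).digest()


def evaluate_probe(keys, responses, scores, reward=1, penalty=5):
    """Update reputation scores after one probe round.

    keys:      node_id -> shared key used to check that node's signature
    responses: node_id -> (payload, signature); silent nodes are absent
    scores:    node_id -> current reputation score
    """
    valid = {}
    for node_id, key in keys.items():
        reply = responses.get(node_id)
        if reply is None:
            continue                                   # no response received
        payload, signature = reply
        if hmac.compare_digest(sign(key, payload), signature):
            valid[node_id] = payload                   # authentic response
    # Reference content: the payload sent by the most validly signed nodes.
    reference = Counter(valid.values()).most_common(1)[0][0] if valid else None
    updated = {}
    for node_id in keys:
        consistent = node_id in valid and valid[node_id] == reference
        delta = reward if consistent else -penalty
        updated[node_id] = min(100, max(0, scores[node_id] + delta))
    return updated


if __name__ == "__main__":
    keys = {"n1": b"k1", "n2": b"k2", "n3": b"k3", "n4": b"k4"}
    responses = {
        "n1": (b"ack:42", sign(b"k1", b"ack:42")),
        "n2": (b"ack:42", sign(b"k2", b"ack:42")),
        "n3": (b"ack:99", sign(b"k3", b"ack:99")),  # inconsistent content
    }                                               # n4 stays silent
    print(evaluate_probe(keys, responses, {n: 60 for n in keys}))
```

In this example the two consistent nodes gain one point each, while the inconsistent node and the silent node each lose five points.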
4. Experimental Results and Discussion
4.1. Experimental Setup and Parameters
4.1.1. Experimental Scenario
To verify the performance of the improved PBFT algorithm in the constructed marine cold chain traceability system, we carried out experiments on marine fishery vessels from Zhoushan Ningtai Marine Fisheries Co., Ltd. in Zhoushan, Zhejiang Province, China. Marine fishery vessel nodes communicate with the shore-based server via Iridium satellites to upload the related data. The vessels operate in the North Pacific, Southeast Pacific, Southwest Atlantic, and Indian Ocean regions and are equipped with advanced temperature control systems to ensure the freshness and high quality of the catch, as illustrated in Figure 9.
Figure 9a displays the temperature data, number of packages, and capacity of the warehouse. A red number indicates an alarm because the temperature is above the upper limit or below the lower limit, while a white number indicates a normal temperature.
4.1.2. Experimental Parameters
PBFT, KBFT, GPBFT, and the proposed NR-PBFT are applied to the marine cold chain traceability system for comparative analysis. The comparison focuses on four key metrics: transaction latency, throughput, communication overhead, and security. The initial reputation score for all nodes is set to 60. During the experiments, clients concurrently sent 500 transactions, with every 50 transactions being packed into a single block. The configuration details for the experiments are shown in
Table 4.
4.2. Transaction Latency
In a blockchain system, transaction latency refers to the time it takes for a transaction request from a client node to reach consensus among all nodes in the blockchain network. In the experiments, the number of nodes increases from 20 to 60, with a step size of 5. To ensure generality, the measurement is repeated for 20 transactions at each node count, and the reported transaction latency is the average over these 20 transactions. A lower transaction latency implies a faster consensus process. The latency formula is expressed as follows:

$$T_{\mathrm{latency}}(tx) = T_{\mathrm{confirm}}(tx) - T_{\mathrm{send}}(tx) \qquad (11)$$

In Equation (11), $T_{\mathrm{latency}}(tx)$ represents the transaction latency of transaction $tx$, $T_{\mathrm{confirm}}(tx)$ denotes the time at which the block containing the transaction is confirmed, and $T_{\mathrm{send}}(tx)$ indicates the time at which the client sends the transaction.
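The measurement procedure can be sketched in a few lines: the latency of each transaction is its confirmation time minus its send time, and the reported value is the mean over the repeated transactions. The function names and the sample timestamps are hypothetical.

```python
def transaction_latency_ms(sent_at_ms: float, confirmed_at_ms: float) -> float:
    """Latency of one transaction: confirmation time minus send time."""
    if confirmed_at_ms < sent_at_ms:
        raise ValueError("a transaction cannot be confirmed before it is sent")
    return confirmed_at_ms - sent_at_ms


def mean_latency_ms(samples) -> float:
    """Average latency over (sent_at_ms, confirmed_at_ms) pairs."""
    if not samples:
        raise ValueError("at least one sample is required")
    latencies = [transaction_latency_ms(s, c) for s, c in samples]
    return sum(latencies) / len(latencies)


if __name__ == "__main__":
    # Three hypothetical transactions; the experiments average over 20.
    print(mean_latency_ms([(0, 100), (10, 130), (20, 160)]))
```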
Figure 10 illustrates the variation in transaction latency for PBFT, KBFT, GPBFT, and the proposed NR-PBFT as the number of nodes in the network increases. The abscissa represents the total number of nodes in the network, while the ordinate represents the transaction latency.
It is observed that at the same node count, PBFT, KBFT, and GPBFT exhibit higher transaction latency than NR-PBFT. With the transaction volume held constant, the transaction latency of all algorithms increases with the number of nodes, and that of NR-PBFT consistently remains lower than that of PBFT, KBFT, and GPBFT. When the number of nodes is below 35, the transaction latency of the four consensus algorithms ranges between 100 ms and 500 ms. As the number of nodes in the network continues to grow, the transaction latency of PBFT, KBFT, and GPBFT increases rapidly. When the number of nodes reaches 60, the transaction latency of PBFT increases to 3226 ms, that of KBFT to 934 ms, and that of GPBFT to 754 ms. In contrast, NR-PBFT maintains a transaction latency of 583 ms, which is 81.92%, 37.58%, and 22.67% lower than that of PBFT, KBFT, and GPBFT, respectively.
Because PBFT, KBFT, and GPBFT lack an effective node grouping scheme and an optimized consensus protocol, their transaction latency increases rapidly with the number of nodes. In contrast, the NR-PBFT algorithm, with its improved node grouping scheme and optimized consensus protocol, limits the selection of leader nodes to those with high credibility. This not only reduces the possibility of Byzantine nodes becoming leader nodes but also decreases the frequency of view changes, thus maintaining a lower transaction latency. Therefore, under the same network conditions, the transaction latency of PBFT, KBFT, and GPBFT is higher than that of NR-PBFT.
4.3. Throughput
Throughput refers to the number of transactions that a system completes within a unit of time. It is a key indicator for evaluating the transaction processing speed of a blockchain and is typically expressed in Transactions Per Second (TPS). In the experiment, the number of nodes increases from 20 to 60, with a step size of five, and for generality, the measurement is repeated for 20 transactions at each node count. The reported throughput for each node count is the average over these 20 transactions. The calculation formula is as follows:

$$TPS = \frac{N_{tx}}{\Delta t} \qquad (12)$$

In Equation (12), $N_{tx}$ represents the total number of transactions and $\Delta t$ denotes the total time spent processing them. The unit of throughput is TPS.
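As a worked instance of this definition, the sketch below divides a transaction count by the elapsed time. The 500 transactions correspond to the batch size used in the experiments; the 2.5 s duration is a hypothetical value chosen only to make the arithmetic easy to follow.

```python
def throughput_tps(n_transactions: int, elapsed_seconds: float) -> float:
    """Transactions completed per second over a measurement window."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return n_transactions / elapsed_seconds


if __name__ == "__main__":
    # 500 transactions confirmed in a hypothetical 2.5 s window.
    print(throughput_tps(500, 2.5))
```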
To evaluate the performance of NR-PBFT in terms of throughput, the study compares PBFT, KBFT, GPBFT, and NR-PBFT across different numbers of nodes. The higher throughput indicates that the network can process more transactions, demonstrating greater processing efficiency.
Figure 11 shows the trends in throughput changes for PBFT, KBFT, GPBFT, and NR-PBFT as the number of nodes increases. The abscissa represents the total number of nodes in the network, while the ordinate represents the throughput.
It can be seen that, under the same experimental conditions, NR-PBFT consistently exhibits higher throughput than the other three algorithms. When the number of nodes is 20, the difference in throughput among the four algorithms is the most significant: the throughput of NR-PBFT is 106 TPS higher than that of GPBFT, 191 TPS higher than that of KBFT, and 399 TPS higher than that of PBFT. As the number of network nodes increases, the throughput of all algorithms declines. When the number of nodes reaches 60, PBFT’s throughput is 27 TPS, KBFT’s is 101 TPS, GPBFT’s is 126 TPS, and NR-PBFT’s is 171 TPS; measured against NR-PBFT’s throughput, those of PBFT, KBFT, and GPBFT are 84.21%, 40.93%, and 26.31% lower, respectively.
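The percentages quoted for 60 nodes can be checked with a few lines of arithmetic. The sketch computes the gap in two ways: relative to NR-PBFT's throughput and relative to each baseline's throughput. The quoted figures of 84.21%, 40.93%, and 26.31% correspond to the first definition (up to truncation of the last digit). The helper names are hypothetical.

```python
TPS_AT_60 = {"PBFT": 27, "KBFT": 101, "GPBFT": 126, "NR-PBFT": 171}


def shortfall_percent(reference: float, other: float) -> float:
    """How far `other` falls below `reference`, as a percentage of `reference`."""
    return (reference - other) / reference * 100


def gain_percent(baseline: float, value: float) -> float:
    """How far `value` exceeds `baseline`, as a percentage of `baseline`."""
    return (value - baseline) / baseline * 100


if __name__ == "__main__":
    nr = TPS_AT_60["NR-PBFT"]
    for name in ("PBFT", "KBFT", "GPBFT"):
        base = TPS_AT_60[name]
        print(name,
              round(shortfall_percent(nr, base), 2),  # relative to NR-PBFT
              round(gain_percent(base, nr), 2))       # relative to the baseline
```

Measured against each baseline instead, the gain of NR-PBFT is considerably larger for PBFT, since 171 TPS is more than six times 27 TPS.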
Due to the lack of a specific reputation evaluation model, PBFT, KBFT, and GPBFT are unable to effectively reward or penalize nodes. This deficiency prevents the system from prioritizing consensus nodes and leader nodes with good reputations, thereby failing to achieve improvements in transaction processing speed and efficiency. The reputation evaluation model of NR-PBFT is designed specifically for traceability scenarios in marine cold chains. Considering the distance between fishery vessels, refrigeration temperature, and data size in the traceability scenario of the marine cold chain, the reliability and robustness of the system are improved.
4.4. Communication Overhead
Communication overhead refers to the amount of communication generated by nodes during the consensus process. The NR-PBFT consensus algorithm reduces communication overhead by decreasing the number of consensus nodes and optimizing the consensus protocol within the PBFT consensus algorithm.
4.4.1. Analysis of PBFT Communication Overhead
The communication overhead of PBFT mainly occurs during the pre-prepare, prepare, commit, and reply phases. Assuming there are $n$ nodes in the system, during the pre-prepare phase, the primary node communicates $n-1$ times; during the prepare phase, each node other than the primary sends a message to all other nodes except itself, resulting in $(n-1)^2$ communications; during the commit phase, each node broadcasts a message to all other nodes, amounting to $n(n-1)$ communications.
Let the total number of communications in PBFT be denoted as $C_{\mathrm{PBFT}}$; the calculation is as follows in Equation (13):

$$C_{\mathrm{PBFT}} = (n-1) + (n-1)^2 + n(n-1) = 2n(n-1) \qquad (13)$$
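The PBFT message count can be checked numerically. The sketch assumes the standard per-phase counts for n nodes (n − 1 pre-prepare messages, (n − 1)² prepare messages, and n(n − 1) commit messages), whose sum simplifies to 2n(n − 1). Under this assumption the count for 60 nodes is 7080, which agrees with the PBFT overhead reported for 60 nodes later in this section.

```python
def pbft_phase_messages(n: int) -> dict:
    """Per-phase message counts for one PBFT consensus round with n nodes."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return {
        "pre_prepare": n - 1,        # primary -> every other node
        "prepare": (n - 1) ** 2,     # each non-primary node -> all other nodes
        "commit": n * (n - 1),       # every node -> all other nodes
    }


def pbft_total_messages(n: int) -> int:
    """Total messages over the three phases; equals 2 * n * (n - 1)."""
    return sum(pbft_phase_messages(n).values())


if __name__ == "__main__":
    for n in range(20, 61, 20):
        print(n, pbft_total_messages(n))
```

The quadratic growth in n is what makes the flat PBFT protocol costly as the number of vessels increases.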
4.4.2. Analysis of NR-PBFT Communication Overhead
The improved PBFT adds a grouping phase, and the node consensus process includes stages such as intra-group prepare, intra-group commit, out-group prepare, out-group commit, intra-group response, and out-group response. Assuming there are consensus sets in the system, during the intra-group prepare phase, the number of communications is ; in the intra-group commit phase, each member of the consensus set sends a commit message to the leader node, and the number of communications is ; in the out-group prepare phase, each leader node broadcasts a prepare message to the other leader nodes, and the number of communications is ; and in the out-group commit phase, each leader node verifies the received messages and broadcasts them to the other leader nodes, and the number of communications is .
Let the total number of communications in NR-PBFT be denoted as
; the number of communications in NR-PBFT is as follows:
Assuming , where is any positive integer greater than 3, and denotes the number of members in the consensus set ( when , .
Figure 12 shows how the communication overhead of the four consensus algorithms changes as the number of system nodes increases.
The abscissa represents the total number of nodes in the network, while the ordinate represents the communication overhead, measured as the number of communications. When the number of nodes is 20, the difference in communication overhead among the four algorithms is the smallest: NR-PBFT requires 116 fewer communications than GPBFT, 257 fewer than KBFT, and 536 fewer than PBFT. As the number of nodes increases, the communication overhead of all four algorithms increases. The communication overhead of PBFT grows the fastest, while that of NR-PBFT grows the slowest. When the number of nodes reaches 60, PBFT’s communication overhead is 7080, KBFT’s is 1350, GPBFT’s is 1047, and NR-PBFT’s is 748, which is 89.4%, 44.5%, and 28.5% lower than that of PBFT, KBFT, and GPBFT, respectively. The reduction in communication overhead in the marine cold chain traceability system not only improves the traceability efficiency of the system but also reduces its operating costs.
4.5. Security
Due to the lack of a reputation evaluation model, the PBFT consensus algorithm fails to effectively penalize Byzantine nodes, allowing them to persist in the blockchain network and thus weakening the overall system security. To solve this problem, the NR-PBFT algorithm introduces a Byzantine node detection mechanism. Once the system detects a malicious node through this mechanism, it immediately reduces the node’s reputation value to zero and removes it from the blockchain network. This reduces the possibility of Byzantine nodes becoming primary nodes, thereby enhancing the security of the blockchain network. The number of Byzantine nodes under the PBFT and NR-PBFT consensus algorithms is compared in Figure 13.
To evaluate the performance of NR-PBFT in terms of security, Byzantine nodes are deliberately introduced into the system, and the change in the number of Byzantine nodes is observed as the number of consensus rounds increases. The total number of nodes is set to 45, and the NR-PBFT consensus algorithm randomly divides the nodes into groups using the consistent hashing algorithm, with the ratio of the consensus node set to the candidate node set being 3:2, resulting in 27 consensus nodes. Since the PBFT and NR-PBFT consensus algorithms both have a certain fault tolerance to Byzantine nodes, the number of Byzantine nodes is set to six. It is assumed that the six Byzantine nodes are initially consensus nodes; the number of consensus rounds increases from 0 to 30, with a step size of five.
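The setup figures can be checked with a short calculation. The sketch applies the 3:2 split to 45 nodes and then the standard PBFT bound, under which n nodes tolerate at most f Byzantine nodes with n ≥ 3f + 1. Applying that bound to the 27-node consensus set as a whole is an assumption made for illustration; the function names are hypothetical.

```python
def split_consensus_candidates(total: int, ratio=(3, 2)):
    """Split `total` nodes into (consensus, candidate) counts by the given ratio."""
    a, b = ratio
    consensus = total * a // (a + b)
    return consensus, total - consensus


def max_byzantine(n: int) -> int:
    """Largest f with n >= 3f + 1, the classical PBFT fault-tolerance bound."""
    return (n - 1) // 3


if __name__ == "__main__":
    consensus, candidates = split_consensus_candidates(45)
    print(consensus, candidates)        # 27 consensus nodes, 18 candidates
    print(max_byzantine(consensus))     # bound for a 27-node consensus set
    print(max_byzantine(45))            # bound for flat PBFT over all 45 nodes
```

Six Byzantine nodes therefore stay within the bound in both configurations, so any difference between the two algorithms in this experiment reflects detection and removal rather than a loss of safety.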
The experimental results show that as the number of consensus rounds increases, the number of Byzantine nodes under PBFT remains constant because PBFT lacks a reputation evaluation model and a Byzantine node detection mechanism. NR-PBFT not only limits the selection of nodes to those with high reliability but also uses its Byzantine node detection mechanism to detect malicious nodes, thereby reducing the possibility of malicious nodes becoming leader nodes. This improvement not only reduces the risk of Byzantine attacks on the system but also enhances its overall stability and reliability.