1. Introduction
Blockchain technology first appeared in a paper [1] by Satoshi Nakamoto, published in November 2008. This paper describes blockchain technology as a decentralized distributed ledger technology that enables data to be recorded and stored securely, traceably, and transparently while remaining tamper-proof. The paper also introduced the world’s first decentralized digital cryptocurrency, Bitcoin. Although initially developed as a transaction platform for cryptocurrency, the impact of blockchain technology extends far beyond this. Its applications are no longer confined to finance, with fields such as the Internet of Things, healthcare, education, and the arts inevitably being transformed by this technology. As blockchain technology advances rapidly and gains widespread adoption, concerns about its security are also receiving increasing attention [2,3,4,5,6,7]. Among these, mining attacks represent one of the most severe threats to the security of blockchain systems [8]. By exploiting vulnerabilities in blockchain consensus mechanisms, attackers can launch mining attacks that disrupt the system’s incentive balance and potentially cause the entire blockchain system to collapse.
Many mainstream cryptocurrencies, such as Bitcoin, Dogecoin, and Litecoin [9], rely on the Proof-of-Work (PoW) consensus mechanism [10,11]. In PoW systems, miners compete to solve computationally intensive puzzles in order to validate transactions and secure the network. As a reward, they receive block subsidies and transaction fees. Although PoW-based blockchains integrate cryptographic techniques and decentralized architectures to enhance security, they are not immune to all forms of attack, particularly strategic attacks targeting the consensus mechanism and incentive model. The “permissionless” nature of blockchain provides a relatively low barrier to entry for malicious actors.
PoW consensus relies on the fundamental principle of symmetry, whereby all participants compete on a level playing field. However, this principle is vulnerable to symmetry-breaking attacks [12] that exploit private chain withholding to create information asymmetry and gain unfair advantages. Building on this concept, our work introduces hybrid strategies that induce profound behavioral and structural asymmetries through Bribery–Stubborn Mining and multi-fork orchestration. These tactics pose more complex threats to the network.
The earliest view held that miners could maximize their mining rewards by strictly following the blockchain’s consensus rules, which was believed to be the optimal strategy [13]. However, this conventional belief was challenged in 2013 with the emergence of the selfish mining attack [14], demonstrating that miners could gain additional revenue by deliberately deviating from the protocol. This opened the door to a series of increasingly sophisticated strategies, including the more aggressive stubborn mining (SbM) [15], block withholding attacks (BWH) [16], and various hybrid approaches.
Among these, hybrid attacks have proven particularly potent. While strategies like bribery–selfish mining (BSM) [17] demonstrated the effectiveness of combining economic incentives with protocol exploits, the potential of pairing bribery with the more aggressive stubborn mining strategy [15] remains a critical and unexplored area of research.
This paper argues that this specific combination has been underexplored and poses a unique threat. Unlike selfish mining, stubborn mining’s tactic of maintaining a private chain even when falling behind prolongs fork competitions, which amplifies the decisive role of bribes in branch selection. Therefore, we first propose the Bribery–Stubborn Mining (BSbM) strategy, where attackers selectively bribe target miners to collaborate during these prolonged forks. This hybrid attack is highly effective and yields additional revenue, but it is constrained by a revenue ceiling. To obtain higher revenue, we introduce the Leading Hidden Bribery–Stubborn Mining (LHBSbM) strategy. By delaying the release of key blocks to construct a triple fork, thereby breaking the traditional double-fork framework, LHBSbM can simultaneously orphan two public blocks, securing additional lead time for the attacker’s private chain and increasing the overall expected profit. This paper is an extended version of our previous conference paper [18].
The key contributions of this paper are as follows:
We propose a hybrid strategy for Bribery–Stubborn Mining that integrates economic bribery with the nonconceding rules of stubborn mining. Through simulation analysis, we derive the criteria under which it is optimal for a rational miner to accept a bribe.
We propose an attack strategy termed Leading Hidden Bribery–Stubborn Mining (LHBSbM), which creates triple-fork scenarios. By constructing a triple fork, the attacker can orphan blocks mined by target miners, increase the effective block share, extend the private chain lead, and consequently increase the expected revenue share.
We develop a Markov chain-based [19] analytical model to evaluate the expected rewards and incentive compatibility for both attackers and target miners.
We conduct extensive simulations to validate the effectiveness of the proposed strategies compared with traditional mining approaches.
The remainder of this paper is organized as follows.
Section 2 reviews several classes of attacks that exploit weaknesses of Proof-of-Work consensus.
Section 3 introduces the proposed Bribery–Stubborn Mining (BSbM) and Leading Hidden Bribery–Stubborn Mining (LHBSbM) strategies. We present their attack models, state transition rules, and formalize the theoretical analysis using a Markov chain-based framework.
Section 4 details the simulation setup and evaluates the performance of the proposed strategies under various conditions. Comparisons are made against honest mining and traditional stubborn mining in terms of reward efficiency and chain control.
Section 5 primarily reviews our work, discusses potential defenses against such attacks, and outlines directions for future research.
Section 6 concludes our work.
2. Related Work
In each round of new block creation within a blockchain system, every node has the opportunity to package transactions and create a new block. However, if multiple nodes generate different new blocks simultaneously, it leads to a blockchain fork, which prevents consensus from being achieved. Therefore, one node must be selected to lead the consensus process; the new block constructed by this node will be recognized by the blockchain network. To reduce the likelihood of malicious nodes leading the consensus process, leadership can be assigned at random. However, simple random assignment is insecure; for example, attackers could launch a Sybil attack [20] by impersonating multiple nodes to increase their chances of gaining consensus leadership. This challenge led to the development of the Proof-of-Work (PoW) consensus mechanism. In PoW, nodes compete by expending computational resources to compute a hash value. The node that successfully calculates the hash value gains leadership rights. This mechanism effectively transforms simple random assignment into a proportional representation of a node’s computational power. In blockchain systems based on the PoW consensus mechanism, the scale of the computational power possessed by miners is the key determinant of mining success. In other words, the greater a miner’s computational power, the greater their probability of mining a new block and receiving the corresponding reward. However, in such decentralized, reward-driven blockchain systems lacking effective oversight, new mining attack strategies continue to be developed.
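The hash competition described above can be illustrated with a minimal sketch. It is a toy puzzle, not any production protocol: it uses a single SHA-256 pass and a hypothetical `difficulty_bits` parameter, whereas real systems differ in header layout, hashing, and difficulty encoding. The function name `mine` and the sample header are assumptions made for illustration.

```python
import hashlib

def mine(header: bytes, difficulty_bits: int, max_nonce: int = 2**32):
    """Search for a nonce whose SHA-256 digest, read as an integer, is below the target."""
    # A smaller target means fewer acceptable digests, i.e. a harder puzzle.
    target = 1 << (256 - difficulty_bits)
    for nonce in range(max_nonce):
        digest = hashlib.sha256(header + nonce.to_bytes(8, "big")).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce, digest.hex()
    return None  # no solution found in the searched nonce range

# Toy difficulty: on average about 2**12 hash attempts are needed.
result = mine(b"toy block header", difficulty_bits=12)
```

Because each attempt succeeds independently with the same small probability, a node that performs more attempts per second wins proportionally more often, which is the proportionality to computational power noted above.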
According to the method employed, mining attacks against Proof-of-Work (PoW) blockchains can be grouped into four categories: (i) 51% attack, (ii) selfish mining attack, (iii) block withholding attack, and (iv) denial-of-service (DoS) attack.
2.1. Selfish Mining Attack
Selfish mining, proposed by Eyal et al. [14] in 2013, is an attack strategy that manipulates block publication so that the attacker’s profitability threshold falls below that of a 51% attack. Upon finding a block, the attacker withholds it instead of broadcasting immediately, keeps it on a private chain, and releases it at strategically advantageous moments to override blocks mined by honest participants. Building on this idea, Nayak et al. [15] introduced a more aggressive family termed stubborn mining, whose core principle is to avoid abandoning the private chain. They describe three cases: lead stubborn mining (if an honest block appears while the private chain is already two blocks ahead, the attacker continues mining privately rather than publishing to override), competitive stubborn mining (when the private and public chains are tied and a fork exists, the attacker keeps mining privately instead of overriding), and trail stubborn mining (the attacker persists on the private chain even when lagging) [15].
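As a worked illustration of why selfish mining lowers the profitability threshold, the sketch below evaluates the closed-form relative revenue from the original selfish mining analysis [14]. Here `alpha` is the attacker’s share of hash power and `gamma` is the fraction of honest miners that mine on the attacker’s branch during a tie; both names are chosen for this sketch, and the formula covers plain selfish mining only, not the stubborn or bribery variants.

```python
def selfish_mining_revenue(alpha: float, gamma: float) -> float:
    """Relative revenue of a selfish miner with hash share alpha and tie-following rate gamma."""
    numerator = (alpha * (1 - alpha) ** 2 * (4 * alpha + gamma * (1 - 2 * alpha))
                 - alpha ** 3)
    denominator = 1 - alpha * (1 + (2 - alpha) * alpha)
    return numerator / denominator

def profit_threshold(gamma: float) -> float:
    """Smallest hash share at which selfish mining matches honest mining."""
    return (1 - gamma) / (3 - 2 * gamma)

# Selfish mining pays off only when the relative revenue exceeds alpha itself.
for share in (0.20, 0.25, 0.30):
    print(share, selfish_mining_revenue(share, gamma=0.5))
```

With `gamma = 0.5` the break-even point is a 25% hash share, and with `gamma = 0` it rises to one third; in both cases it is well below the 50% required for a majority attack.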
In terms of optimization, Sapirshtein et al. [21] proposed an optimal selfish mining policy using reinforcement learning, modeling the miner as an intelligent agent that learns the best action for each state via iterative updates; in theory, the resulting policy approaches the upper bound of selfish mining profitability. Zhang et al. [22] extended this reinforcement learning approach to stubborn mining, designing adaptive policies that respond to fork conditions. Beyond single-strategy optimization, hybrid attacks have been explored. Gao et al. proposed a bribery–selfish mining strategy that augments blocks with additional fee incentives to recruit target miners [17]. Yang et al. [23] modeled bribery-based selfish mining as a Markov decision process, analyzing dynamic rewards and learning an optimal bribery–selfish policy. Multi-attacker behavior has also been studied: Azimy et al. [24] evaluated attack efficiency with a blockchain multi-attacker simulator, and Bai et al. [25] examined the broader impact of multiple selfish miners on system stability. Additional variants include Jeyasheela et al.’s Q-learning-based selfish mining strategy [26] and an undetectable selfish mining design by Bahrani et al. [27] that aims to suppress orphan-block anomalies and evade detection.
To counter selfish mining, a range of defenses has been proposed. Billah [28] introduced Freshness Preferred, which penalizes withheld blocks by monitoring timestamps and reducing rewards for delayed publication, thereby raising the selfish mining threshold from about 25% to 32% under their model. On the detection side, Saad et al. [29] analyzed transaction volume, serial numbers, and mining costs to flag anomalous behavior, and Wang et al. [30] showed that uncle block mechanisms can reduce the profitability of selfish and stubborn mining. Other mitigation strategies include Bicer et al.’s timestamp-based defenses [31] and Lee et al.’s “detective” strategy that checks whether a pool’s claimed previous-block hash has already appeared on-chain [32]. Additional proposals include Habib et al.’s virtual block mechanism that constrains release timing [33], Wang et al.’s SM-NEEDLE deep learning detector [34], and Nikhalat-Jahromi et al.’s AI-based defense that assigns time-dependent weights to downrank likely withheld blocks [35].
2.2. The 51% Attack
The 51% attack, first described by King et al., is among the earliest mining attacks and features both the highest entry threshold and the greatest destructive power. Once an adversary controls more than 50% of the network’s total hash power, it can dictate block production and chain selection, effectively seizing control of the blockchain and undermining decentralization. The attacker can initiate reorganizations at any height, create forks, and roll back transactions, thereby enabling double spending [36] (i.e., invalidating previously confirmed transactions). Under idealized assumptions, the attacker’s profit is theoretically unbounded.
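The role of the 50% threshold can be made concrete with the gambler’s-ruin result used in the original Bitcoin paper [1]: an attacker with hash share `q` who is `z` blocks behind eventually catches up with probability (q/p)^z when q < p = 1 − q, and with certainty otherwise. The sketch below is a simplified calculation under that model; the function name is chosen for illustration, and the model ignores network delay and the progress the attacker may make while the merchant waits for confirmations.

```python
def catch_up_probability(q: float, z: int) -> float:
    """Probability that an attacker with hash share q ever erases a deficit of z blocks."""
    p = 1.0 - q  # combined share of all other miners
    if q >= p:
        return 1.0  # with at least half the hash power, catching up is certain
    return (q / p) ** z

# A minority attacker's chance decays geometrically with the deficit,
# while a majority attacker always succeeds eventually.
for deficit in (1, 3, 6):
    print(deficit, catch_up_probability(0.25, deficit), catch_up_probability(0.51, deficit))
```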
Since the advent of blockchain technology, the 51% attack has attracted sustained attention as a potential security threat. Miller et al. [37] report a centralizing trend in Bitcoin’s hash power, suggesting a rising likelihood of 51% attacks. Karame et al. [36] demonstrate the practical feasibility of double spending in Bitcoin’s fast-payment scenarios. To counter 51% attacks, several defenses have been proposed. Yang [38] designs a weighted-difficulty PoW protocol in which, even if an attacker’s branch advances faster than the main chain, its historical weighted difficulty remains lower than that of the main chain, preventing the malicious branch from overtaking. Sayeed [39] proposes a penalty mechanism that increases an attacker’s cost by comparing the height of incoming blocks with the current chain length. Bae [40] introduces random miner grouping based on hash functions and wallet addresses; after a block is mined, the group responsible for the next block is selected via the current network hash value, thereby constraining mining within specific groups and effectively reducing the risk of 51% attacks.
2.3. Block Withholding Attack
Block withholding attacks (BWH), proposed by Rosenfeld [41] in 2011, were initially regarded as a “sabotage-without-profit” strategy. In Rosenfeld’s formulation, an attacker joins a target pool but discards any full blocks it finds there. Bag et al. later introduced a new BWH model that manipulates the allocation of mining power so that the attacker’s profitability threshold falls below that of a 51% attack. The idea is to depress the overall effective hash rate and thereby raise the attacker’s relative share to extract extra revenue. Concretely, the attacker splits its hash power into two parts: one mines honestly, while the other infiltrates the victim pool to share its public payouts but discards blocks found within that pool, reducing the pool’s income.
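The power-splitting idea can be sketched numerically. The calculation below is a simplified illustration rather than the model of any cited paper: it assumes a single victim pool, payouts proportional to submitted shares, and no reaction from other miners. The names `alpha` (attacker share), `beta` (victim pool share, assumed positive), and `tau` (fraction of attacker power sent into the pool) are hypothetical.

```python
def bwh_attacker_revenue(alpha: float, beta: float, tau: float) -> float:
    """Attacker's share of all block rewards when a fraction tau of its power infiltrates a pool."""
    infiltrating = tau * alpha
    # Infiltrating power never yields blocks, so the effective network power shrinks.
    effective_total = 1.0 - infiltrating
    solo_share = (alpha - infiltrating) / effective_total
    pool_share = beta / effective_total
    # The pool splits its income over all apparent members, including the infiltrator.
    attacker_cut = infiltrating / (beta + infiltrating)
    return solo_share + pool_share * attacker_cut

print(bwh_attacker_revenue(0.2, 0.3, 0.0))  # no infiltration: revenue equals the hash share
print(bwh_attacker_revenue(0.2, 0.3, 0.1))  # modest infiltration: revenue exceeds the hash share
```

Under these assumptions the attacker gains because the withheld power lowers the effective network hash rate while the attacker still collects a cut of the victim pool’s remaining income.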
Laszka et al. [42] developed a game-theoretic model to explore strategic interactions among pools, identifying conditions under which mutual non-aggression is sustainable and when attacked pools become marginalized. Kwon et al. [43] introduced Fork-After-Withholding (FAW), which combines BWH with selfish mining: instead of discarding the withheld block, the attacker strategically releases it to create an intentional fork. By leveraging the advantages of both BWH and selfish mining, FAW can yield up to 2.5× the extra revenue of pure BWH. Gao et al. proposed the Power Adjusting Withholding (PAW) attack, which improves on FAW by dynamically reallocating hash power according to pool conditions, thereby lowering the hash-rate threshold required for a successful attack. Dong et al. [44] presented a hybrid self-sustaining attack, and Wang et al. [45] proposed a hybrid BWH strategy modeled via reinforcement learning to determine optimal actions across states.
To defend against BWH, multiple strategies have been proposed. Schrijvers, Bag, and Chen et al. [46,47,48] advocate modifying pool reward mechanisms, while Zhou et al. [49] suggest dynamically partitioning a pool into multiple sub-pools and placing suspicious miners into smaller groups, thereby breaking the conditions necessary for excess attacker profit.
2.4. Denial-of-Service Attack
Denial-of-service (DoS) attacks [50] aim to exhaust a target’s resources so that it cannot provide normal service. In blockchain networks, a typical DoS variant is the eclipse attack [51], in which an adversary controls all of a victim node’s external connections and isolates it from the rest of the network. Nayak et al. [15] proposed a hybrid strategy that combines selfish mining with eclipse attacks, using network isolation to increase the extra revenue of selfish mining. Mirkin et al. [52] introduced the Blockchain Denial-of-Service (BDoS) attack, which, unlike traditional network-layer DoS, exploits protocol-level incentives: by convincing others that the next block has already been found by the attacker, miners whose expected reward falls below their power cost voluntarily stop mining. Wang et al. [53,54] proposed SDoS, a combination of selfish mining and DoS, which depresses miners’ willingness to mine, increases the attacker’s relative hash share, and thus raises the fraction of blocks mined by the attacker.
To defend against DoS, several mechanisms have been proposed. Ilyas et al. [55] use deep neural networks to detect DDoS attacks, and Sousa et al. [56] apply machine learning-based detection for DoS. Raikwar et al. [57] propose a mitigation technique based on verifiable delay functions (VDFs).
3. Attack Strategy and Theoretical Modeling
In this section, we introduce two attack methods: Bribery–Stubborn Mining (BSbM) and its advanced variant, the Triple-fork-based Leading Hidden Bribery–Stubborn Mining (LHBSbM). We first elaborate on the attack framework and theoretical foundations of the BSbM strategy. Then, we present the LHBSbM strategy, which further amplifies the attacker’s profit advantage by carefully constructing triple-fork scenarios. The theoretical models presented in this paper provide a rigorous basis for understanding the mechanisms and potential profitability of these attacks.
In this study, we use Markov chains as our primary analytical model because they are well suited to the randomness and state dependency inherent in blockchain mining. The system can be described as a set of discrete states corresponding to the attacker’s lead over the public chain, with probabilistic transitions between states that depend on the distribution of miners’ computational power. The primary objective of our theoretical analysis is to determine the long-term steady-state payoffs for each participant, a task for which the Markov chain framework is a standard tool.
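To illustrate what a steady-state computation looks like, the sketch below finds the stationary distribution of a deliberately simplified lead chain by power iteration. It is not the BSbM chain analyzed later: it has no fork states, and the truncation at `max_lead`, the parameter `a` (attacker hash share), and both function names are assumptions made for this example.

```python
def toy_lead_chain(a: float, max_lead: int):
    """Transition matrix of a truncated random walk on the attacker's lead 0..max_lead."""
    n = max_lead + 1
    P = [[0.0] * n for _ in range(n)]
    for lead in range(n):
        P[lead][min(lead + 1, max_lead)] += a      # attacker finds the next block
        P[lead][max(lead - 1, 0)] += 1.0 - a       # another miner finds the next block
    return P

def stationary_distribution(P, iterations: int = 2000):
    """Approximate the stationary distribution by repeatedly applying pi <- pi P."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

pi = stationary_distribution(toy_lead_chain(a=0.3, max_lead=6))
print([round(x, 4) for x in pi])
```

For this toy chain the long-run probabilities decay geometrically with ratio a/(1 − a); long-run payoffs are then obtained by weighting the reward earned in each state by its stationary probability.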
3.1. Bribery–Stubborn Mining
3.1.1. Attack Strategy
The core principle of BSbM is to first induce a blockchain fork through strategic attacks and then attach appropriate bribery fees at the fork to incentivize target miners to assist the attacker in extending the attacker’s fork. This process aims to override blocks mined by honest miners and thereby increase the attacker’s mining rewards. In the BSbM attack model, miners in the system are classified into three categories: attackers, honest miners, and target miners. Attackers are malicious mining nodes that launch the attack by employing various strategies and adjusting bribery fees to create forks and attract target miners to persistently mine on the attacker’s private chain, thus securing mining rewards disproportionate to their computational power share. Honest miners are nodes that do not accept bribes from the attacker and follow the blockchain protocol, mining on the public chain according to the first valid block they receive based on network communication. Target miners are rational nodes in the system who, during a blockchain fork, can choose whether to accept the attacker’s bribe and mine on the attacker’s chain or reject it and continue mining on the honest public chain.
The BSbM strategy is characterized by several key parameters and state variables:
L: Represents the lead in block count of the attacker’s private chain over the current longest public chain.
F: Denotes the composition of the current blockchain fork. indicates no fork; indicates a fork between the honest miners and the attacker; indicates a fork on the public chain where the honest public chain branch contains blocks mined by target miners.
R: Indicates the honest miner following rate, which represents the proportion of honest miners who choose to mine on the attacker’s chain after a fork occurs in the public chain.
The BSbM attack, for the purpose of this description, is primarily discussed in the context of competitive stubborn mining. The strategy can be adapted for lead stubborn and trail stubborn scenarios, with corresponding modifications to the attacker’s actions. At the beginning of the attack, the attacker accepts the honest public chain as both the private chain and the attacker’s chain. At this point, there is no fork in the public chain, and the block lead count is zero. Since the attacker will immediately release their block upon the honest miner publishing a new block, honest miners in the network receive both competing forks at approximately the same time. Therefore, the follow ratio R is set to 0.5, meaning that half of the honest miners mine on the honest public chain, while the other half mine on the attacker’s chain. The specific attacker strategy is illustrated in Table 1, where
denotes the three types of miners finding a new block, and
represents the current state of the blockchain.
B indicates that the target miner accepts the bribe and helps the attacker extend their private chain, while
denotes the opposite.
3.1.2. State Transitions and Event Modeling
Figure 1 illustrates the first two states involved in the BSbM attack, showing how the attack begins and evolves as the attacker starts to build a private chain.
State 0
At this stage, no fork exists in the public blockchain, and all miners are mining on the public chain. The green, blue, and red arrows represent the mining locations of honest miners, target miners, and the attacker, respectively. The square blocks denote the public blockchain without any forks, as shown in Figure 1a. The events occurring under State 0 include the following:
Event 0-1: An honest miner finds and publishes a new block on the public chain, remaining in State 0.
Event 0-2: The attacker finds a new block on the public chain and adds it to the private chain, transitioning to State 1.
Event 0-3: The target miner finds and publishes a new block on the public chain, remaining in State 0.
State 1
At this stage, no fork exists on the public chain, and the attacker’s private chain leads by one block (L = 1). The dashed circle represents the unpublished block on the private chain. The attacker mines on the private chain, while honest and target miners continue mining on the public chain, as illustrated in Figure 1b. The events occurring under State 1 include the following:
Event 1-1: An honest miner finds and publishes a new block on the public chain; the attacker then publishes their private chain to create a fork, transitioning to State .
Event 1-2: The attacker mines a new block on the private chain, extending it, and transitions to State 2.
Event 1-3: The target miner finds and publishes a new block on the public chain; the attacker publishes their private chain to form a fork, transitioning to State .
State
In this state, a fork exists between the honest public chain and the attacker’s chain, where the honest chain consists solely of blocks mined by honest miners. The two forked chains have equal length. The diamond and solid-circle lines represent the honest chain and attacker chain, respectively. The hollow blue arrows indicate the two scenarios where the target miner either rejects or accepts the bribe. The branch length n is a positive integer. The attacker mines on the attacker’s chain, while a proportion r of honest miners mine on the attacker’s chain, and the remaining proportion 1 − r mine on the honest chain. The target miner chooses to mine on either chain based on the trade-off between accepting the bribe and mining profitability, as shown in
Figure 2. The events occurring under State
include the following:
Event -1: An honest miner finds and publishes a block on the honest chain, transitioning to State 0.
Event -2: An honest miner finds and publishes a block on the attacker’s chain, transitioning to State 0.
Event -3: The attacker finds a block on the attacker’s chain, extends the private chain, and transitions to State .
Event -4: The target miner finds and publishes a block on either the attacker’s chain or the honest chain, transitioning to State 0.
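The split of mining power in this equal-length fork can be written out as a small calculation. The sketch below only computes the probability that the next block is mined on top of the attacker’s branch; it is not the reward formula of the analysis that follows. The names `a` (attacker power) and `t` (target miner power) are hypothetical labels introduced here, while `o` (honest power) and `r` (honest following rate) follow the text.

```python
def attacker_branch_probability(a: float, t: float, o: float, r: float,
                                target_accepts: bool) -> float:
    """Probability that the next block extends the attacker's branch in an equal-length fork."""
    if abs(a + t + o - 1.0) > 1e-9:
        raise ValueError("mining power shares must sum to 1")
    power_on_attacker_branch = a + r * o       # attacker plus the following honest miners
    if target_accepts:
        power_on_attacker_branch += t          # a bribed target miner joins the attacker's branch
    return power_on_attacker_branch

# Example shares: attacker 20%, target miner 10%, honest miners 70%, r = 0.5.
print(attacker_branch_probability(0.2, 0.1, 0.7, 0.5, target_accepts=True))
print(attacker_branch_probability(0.2, 0.1, 0.7, 0.5, target_accepts=False))
```

The difference between the two cases equals the target miner’s power, which is why the bribe decision matters most when the fork race is close.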
State
Here, the blockchain splits into two equal-length forks: the honest chain and the attacker’s chain. The honest chain contains blocks mined by the target miner, marked as triangles in the figure (these may represent any of the n blocks on the fork, with the first shown as an example). The attacker continues mining privately; some honest miners (a proportion r) mine on the attacker’s chain, while others mine on the honest chain. The target miner mines on the honest chain, as depicted in Figure 3. In this state, the following events may occur:
An honest miner finds and publishes a block on the honest chain, returning the system to State 0.
An honest miner finds and publishes a block on the attacker’s chain, also leading back to State 0.
The attacker mines a new block on their private chain, advancing to State .
The target miner mines and publishes a block on the honest chain, returning to State 0.
State
In this state, the public chain forks into an honest chain and an attacker’s chain, where the honest chain consists solely of blocks mined by honest miners, and the attacker’s private chain leads by one block. The attacker mines on the private chain, while a proportion r of honest miners mine on the attacker’s chain, and the remaining proportion 1 − r mine on the honest chain. The target miner chooses to mine on either chain based on the trade-off between accepting the bribe and mining profitability, as illustrated in
Figure 4. The events occurring under State
include the following:
Event -1: An honest miner finds and publishes a block on the honest chain; the attacker then publishes their hidden private blocks, transitioning to State .
Event -2: An honest miner finds and publishes a block on the attacker’s chain; the attacker publishes their hidden private blocks, transitioning to State .
Event -3: The attacker mines a new block on the private chain, extending it, and transitions to State 2.
Event -4: The target miner finds and publishes a block on either the attacker’s chain or the honest chain; the attacker publishes their hidden private blocks, transitioning to State .
State
The blockchain currently forks into two branches: the honest chain, which includes blocks mined by the target miner, and the attacker’s private chain, which holds a one-block lead. Mining activities proceed with the attacker working on their private chain, while a fraction r of honest miners support the attacker’s branch; the rest mine on the honest chain. The target miner mines exclusively on the honest chain, as depicted in Figure 5.
Possible events in this state are as follows:
Event -1: A block is mined and published by an honest miner on the honest chain, prompting the attacker to reveal their private blocks and transition to State .
Event -2: A block is mined and published by an honest miner on the attacker’s chain, causing the attacker to disclose their private chain and move to State .
Event -3: The attacker mines a new block on their private chain, thereby advancing to State 2.
Event -4: The target miner finds and publishes a block on the honest chain, which leads the attacker to reveal their private blocks and return to State .
State 2
The attacker’s private chain leads the public chain by two blocks, with the fork length m taking the values 0 or n. Mining continues with the attacker working on their private chain, while both honest and target miners mine on the public chain. When the honest chain contains blocks mined by the target miner, the target miner mines specifically on the honest chain; otherwise, they mine on the public chain. These situations are depicted in Figure 6.
Several events can occur under State 2:
Event 2-1: A block is mined and published by an honest miner on the public chain, prompting the attacker to reveal two private hidden blocks and revert to State 0.
Event 2-2: The attacker mines a new block on their private chain, extending it and moving the system into State 3.
Event 2-3: The target miner mines and publishes a block on the public chain, leading the attacker to reveal two private blocks and return to State 0.
State L (L ≥ 3)
At this time, the attacker leads by L blocks. The attacker mines on the private chain, while the target miner and honest miners mine on the public chain. If a fork exists and the honest chain contains blocks mined by the target miner, the target miner mines on the attacker’s chain. If a fork exists but the honest chain does not contain blocks mined by the target miner, the target miner mines on the public chain. These are shown in Figure 7.
The events occurring under State L include the following:
Event L-1: An honest miner finds and publishes a block on the public chain, and the attacker reveals one hidden block from the private chain and transitions to State L − 1.
Event L-2: The attacker mines a new block on the private chain, adds it to the private chain, and transitions to State L + 1.
Event L-3: The target miner finds and publishes a block on the public chain, and the attacker reveals one hidden block from the private chain and transitions to State L − 1.
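The three events of the generic lead state can be summarized as a one-step update of the lead: a block found by the attacker lengthens the lead by one, while a block found by any other miner makes the attacker reveal one hidden block and shortens the lead by one. The sketch below encodes that rule and runs a short simulation. It assumes the generic state applies for leads of three or more, and the attacker share `a`, the finder labels, and the function names are illustrative choices rather than notation from the model.

```python
import random

def next_lead(lead: int, finder: str) -> int:
    """Lead after one block is found while the attacker is in the generic lead state."""
    if lead < 3:
        raise ValueError("this sketch covers only leads of three or more")
    # Attacker block: lead grows. Honest or target block: one hidden block is revealed.
    return lead + 1 if finder == "attacker" else lead - 1

def blocks_until_lead_two(start_lead: int, a: float, rng: random.Random) -> int:
    """Count blocks found until the lead first drops to two (intended for a < 0.5)."""
    lead, blocks = start_lead, 0
    while lead > 2:
        finder = "attacker" if rng.random() < a else "other"
        lead = next_lead(lead, finder)
        blocks += 1
    return blocks

rng = random.Random(42)  # fixed seed so the run is reproducible
samples = [blocks_until_lead_two(5, 0.3, rng) for _ in range(1000)]
print(sum(samples) / len(samples))  # average duration of the lead phase in blocks
```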
3.1.3. Theoretical Analysis
We first analyze the mining rewards of miners within the BSbM attack framework and then proceed to model and analyze the state transitions of the blockchain system using a Markov chain.
Attacker’s Reward
The attacker obtains a block reward when event 1-2,
-3,
-3, or
L-2 (
) occurs. For events 0-2 and
-3, if the target miner accepts the bribe, the attacker has a certain probability of obtaining a block reward as follows:
Here, the parameters
,
, and
o represent the mining power of the attacker, the target miner, and the honest miners, respectively. The variable
r denotes the proportion of honest miners who follow the attacker’s chain. If the target miner rejects the bribe, the probability that the attacker obtains a block reward is given by
When event
-3 occurs, the attacker has a probability of
of obtaining a block reward.
Target Miner’s Reward
The target miner obtains a block reward when event 0-3,
-4, or
-4 occurs. For events 1-3,
-4, and
-4, the target miner has the following probability of obtaining a block reward:
Honest Miners’ Reward
Honest miners obtain a block reward when event 0-1,
-1,
-2,
-1, or
-2 occurs. For events 1-1,
-1,
-2, and
-2, if the target miner accepts the bribe, the honest miners have the following probability of obtaining a block reward:
If the target miner rejects the bribe, the probability that honest miners obtain a block reward is given by
When event
-1 occurs, honest miners have a probability of
of obtaining a block reward.
The Markov chain model representing the state transitions of the blockchain system under the BSbM attack is illustrated in Figure 8, where each state corresponds to the blockchain system states described in the previous section.
In the competitive BSbM mining model,
(
, where
L is an integer) denotes the probability of states 0 to
L, and
(with
or 1, and
or
b) represents the probability of states
,
,
, and
. The state probabilities of the Markov model depicted in
Figure 8 are given as follows.
Further calculations yield the explicit formulas for the state probabilities as follows.
Based on the aforementioned model, the miners’ rewards can be analyzed to determine the target miner’s optimal mining strategy and the attacker’s range of possible rewards. The following conclusions can be drawn:
Conclusion 1. When the attacker launches the BSbM attack, the target miner achieves higher mining rewards by accepting the attacker’s bribe and mining on the attacker’s chain during a blockchain fork, compared to rejecting the bribe and mining on the honest public chain. In other words, choosing to accept the bribe and assist the attacker in mining during a fork is the optimal mining strategy for the target miner.
Proof. If the target miner chooses to accept the attacker’s bribe and assist in mining, the target miner’s reward
is given by
If the target miner chooses to reject the attacker’s bribe, the reward
is
Combining the above equations, we have
Since , where is the bribery payment from the attacker, it follows that . □
Conclusion 2. In the BSbM attack, there exists an appropriate bribery factor such that the attacker achieves higher mining rewards when the target miner accepts the bribe compared to rejecting it. Moreover, the attacker’s maximum reward is attained when .
Proof. If the target miner chooses to accept the attacker’s bribe and assist in mining, the attacker’s reward
is given by
Considering the bribery payment
the attacker must pay, the attacker’s total reward
is
If the target miner chooses to reject the attacker’s bribe, the attacker’s reward
is
Combining the above equations, we have
Since
, it follows that
which leads to the critical condition:
The maximum reward when is . □
3.2. Triple-Fork-Based Leading Hidden BSbM
In Proof-of-Work-based blockchain systems, miners’ rewards are determined by their effective block occupancy rate. Attacks based on selfish mining principles, including BSbM, increase the attacker’s effective block occupancy by first creating a fork and then discarding blocks on the public chain. However, this attack approach has limitations when considering fork structures: under the same block height, it can only discard one block, resulting in a maximum effective block occupancy of for the attacker. In actual mining, the blockchain may experience three forks, which potentially allow the discarding of more blocks at the same block height, thus increasing the attacker’s effective block occupancy. Therefore, this paper further proposes a Triple-fork-based Leading Hidden BSbM (LHBSbM) attack, which creates a triple fork to extend the attacker’s private chain lead time, thereby increasing the attacker’s block occupancy and enhancing mining rewards.
3.2.1. Attack Strategy
Under BSbM, when the blockchain system is in State L () and an honest miner successfully mines a new block, the attacker immediately publishes one block from the private chain. The attacker’s chain and the honest public chain then form a fork of equal length, prompting miners to extend one of the two public branches. In the LHBSbM attack strategy, State L is instead treated as a leading state: when an honest miner mines a block, the attacker does not reveal any private chain block, and the system enters a hidden state. In the hidden state, if the next new block is mined by the target miner on the attacker’s chain, the blockchain system forms a triple fork. In this triple-fork state, the attacker’s private chain can discard two blocks at the same block height, thereby increasing the attacker’s effective block occupancy.
In the initial state, or when L < 2, the LHBSbM strategy follows the same logic as BSbM. The complete strategy is detailed in Table 2.
3.2.2. State Transitions and Event Modeling
The blockchain states and events under the LHBSbM attack are described as follows.
State L ()
As shown in Figure 9, when the mining time is N, the blockchain is in a leading state. The attacker’s private chain leads both the honest chain and the attacker’s public fork by L blocks. The attacker mines on the private chain, the target miner mines on the attacker’s fork, and honest miners split their mining power: a fraction r mines on the attacker’s fork, and the remaining 1 − r mines on the honest chain.
If Event A occurs, at time , the blockchain enters a hidden state. In this state, the attacker’s private chain leads the honest chain and the attacker’s fork by and L blocks, respectively. The attacker continues to mine on the private chain. The target miner chooses to mine on either the attacker’s fork or the honest chain based on whether the bribe is accepted and the comparative expected reward. Honest miners mine on the honest chain.
If Event B occurs, at time , the blockchain enters a triple-fork state. Here, the attacker’s private chain leads both the honest chain and the attacker’s fork by blocks. The attacker continues to mine on the private chain, while the target miner mines on the attacker’s fork and honest miners mine on the honest chain.
Event A: An honest miner finds and publishes a block on the honest chain. The attacker then hides the private chain, transitioning the system to the hidden state.
Event B: The target miner accepts the bribe from the attacker and mines a block on the attacker’s fork, resulting in a triple-fork state.
In the triple-fork state, the worst-case scenario is that the attacker immediately publishes the private chain and eliminates two blocks (at least one more than BSbM). The ideal scenario is shown in Figure 10, where Event A and Event B occur alternately. The private chain leads the honest public chain and the attacker chain by one and two blocks, respectively. When the attacker publishes the private chain, the number of blocks that can be eliminated is approximately 2L (double the number eliminated by BSbM).
3.2.3. Theoretical Analysis
In the Bitcoin system, a new block is generated every 10 min on average. Under the assumption of constant mining power, the probability of successfully mining a block is proportional to the mining time. Therefore, the following conclusion can be drawn regarding the private chain lead time in the LHBSbM attack:
Conclusion 1. Compared to the classical double-fork attack, the LHBSbM attack achieves a longer private chain lead time, meaning the attacker has more time to mine on the private chain. This implies that under equal mining power, the attacker has a greater chance to extend the private chain and enlarge the lead, thereby increasing mining revenue.
Proof. In the LHBSbM attack, when the system state is greater than 2 and a triple fork is successfully created, the attacker can override up to two public chain blocks with each private chain block released—one mined by honest miners and the other by target miners. When publishing N private chain blocks, the attacker can maintain a lead time equivalent to block generation intervals. If the triple-fork construction fails, the LHBSbM attack degenerates to the ordinary double-fork scenario, where the minimum lead time for N private blocks is block intervals (the same as the classical double-fork attack). In other states, the attacker follows the same strategy as in the double-fork attack, maintaining the same lead time. Therefore, the total private chain lead time generated in LHBSbM is longer. □
Conclusion 2. Compared with the maximum effective block occupancy rate under the double-fork scenario, the LHBSbM attack achieves a higher maximum effective block occupancy rate. If the attacker’s mining power exceeds one-third of the total system power, the maximum effective block occupancy rate reaches 1. If the attacker’s mining power is less than one-third, then when the target miner’s power exceeds that of the honest miners, the maximum effective block occupancy rate is ; otherwise, it is .
Proof. Let
,
, and
denote the numbers of blocks mined by the attacker, target miners, and honest miners, respectively, during time
T. The
blocks mined by the attacker can cover two public chains of lengths
and
at most. Let
and
be the numbers of target miner and honest miner blocks not covered by the attacker, respectively. Then we have
When the uncovered blocks mined by target miners and honest miners overlap, if
, the minimal effective block count after coverage is
; if
, it is
. Thus, the attacker’s effective block rate
is
According to the law of large numbers, we have
If the attacker’s mining power
exceeds one-third of the total system power, and both target and honest miners have less than one-third each, then
and
. Setting
and
yields
When the attacker’s mining power is less than one-third, combining the above gives
Considering that the target miner’s mining power is usually less than that of honest miners in practice and that, when , 1 as the limit does not effectively reflect the positive correlation between mining power and reward, we therefore use to represent the maximum effective block occupancy rate of the attacker. □
4. Simulation and Evaluation
To evaluate the effectiveness of the proposed BSbM and LHBSbM attack strategies, we conducted simulation experiments that emulated the block generation and fork resolution process in a Proof-of-Work blockchain. The simulation considered three types of miners: the attacker (hash power ), the target miner (hash power ), and honest miners (hash power ). All experiments were conducted on a physical machine running Ubuntu 23.10. Using Python 3.10, we simulated a blockchain with 1,000,000 blocks on a PC equipped with an Intel Xeon W-2275 @ 3.30 GHz CPU and 125 GB RAM for experimentation and analysis.
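The core of such a simulation, drawing the winner of each block in proportion to hash power, can be sketched as follows. This is a minimal illustration rather than the simulator used for the experiments: the function name, the dictionary keys, and the seed are hypothetical, the attacker and target values (0.4 and 0.1) are borrowed from Section 4.2.1, and the honest share of 0.5 is assumed to be the remainder.

```python
import random


def sample_block_winners(powers, n_blocks, seed=0):
    """Draw the miner of each block with probability proportional to hash power.

    `powers` maps a miner label to its hash power share (hypothetical labels).
    Returns a dict with the number of blocks won by each miner.
    """
    rng = random.Random(seed)
    names = list(powers)
    weights = [powers[name] for name in names]
    counts = dict.fromkeys(names, 0)
    for winner in rng.choices(names, weights=weights, k=n_blocks):
        counts[winner] += 1
    return counts


if __name__ == "__main__":
    # Assumed split: attacker 0.4, target 0.1, honest miners take the rest.
    powers = {"attacker": 0.4, "target": 0.1, "honest": 0.5}
    counts = sample_block_winners(powers, 1_000_000)
    for name, count in counts.items():
        print(name, count / 1_000_000)
```

Over a long run the empirical block shares approach the hash power shares. Fork handling, bribery, and chain selection would sit on top of this sampling step and are not modeled here.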
4.1. Evaluation of BSbM Attack
This subsection presents the simulation analysis of the Bribery–Stubborn Mining (BSbM) attack. In these experiments, the target miner’s hash power is set to 0.1, and the bribery factor is 0.02. Mining revenues for all miners are normalized for comparison.
4.1.1. Target Miner’s Revenue
We first analyze the optimal mining strategy for the target miner when a fork occurs in the BSbM attack. Figure 11 shows the percentage of additional mining revenue that the target miner gains by accepting the attacker’s bribe compared to refusing it. The simulation results align with the theoretical analysis, confirming that accepting the bribe and mining on the attacker’s chain always provides higher revenue for the target miner. This additional revenue increases as the attacker’s hash power grows, due to the larger rewards and bribes that the attacker can afford.
Figure 12 further details the target miner’s extra revenue under different honest miner follow rates (r), comparing bribe acceptance (solid lines) and refusal (dashed lines). The results consistently demonstrate that accepting the bribe is more profitable. A higher r leads to reduced gains for the target miner. Notably, there exists a critical point below which accepting the bribe may still result in a net loss, but a smaller loss than refusing it, which confirms that accepting the bribe remains the optimal strategy.
4.1.2. Attacker’s Revenue
Next, we evaluate the attacker’s revenue under the BSbM strategy. Figure 13 illustrates the attacker’s additional revenue compared to honest mining when the target miner refuses the bribe. Even without cooperation, an attacker with sufficient hash power and a favorable follow ratio r can still outperform honest mining.
Conversely,
Figure 14 shows the attacker’s extra revenue when the target miner accepts the bribe. In this case, the required threshold for
to make the attack profitable is lower. For instance, with
, profitability begins when
.
Figure 15 compares attack revenue with and without bribery for
under different
r values. The attacker consistently benefits from offering bribes (solid lines above dashed). However, as
increases, the marginal benefit of bribery decreases. When the attacker’s hash power is large, the bribe may become unnecessary.
4.2. Evaluation of LHBSbM Optimization
This subsection evaluates the Triple-fork-based Leading Hidden BSbM (LHBSbM) attack, which is an optimization of BSbM. LHBSbM aims to further increase the attacker’s effective block share and private chain mining lead time by creating a blockchain triple fork.
4.2.1. Comparison of Discarded Blocks (Extra Mining Time)
The first experiment compares the average number of discarded blocks (which translates to extra mining time for the attacker) generated by LHBSbM versus the competitive BSbM strategy. For this simulation, the attacker’s hash power is set to 0.4, the target miner’s hash power to 0.1, and the bribery factor to 0.02. The analysis focuses on scenarios where the initial state is greater than or equal to 2, as these are conditions under which LHBSbM can initiate a triple fork. Each data point is an average over 1,000,000 mining cycles, measuring the extra mining time gained from the initial state until the system returns to State 0.
Figure 16 presents the results, where the red line denotes LHBSbM and the blue line denotes competitive BSbM. Across all simulated initial states (from 2 to 48), LHBSbM consistently results in a higher average number of discarded blocks than BSbM. Furthermore, the difference in the average number of discarded blocks between the two strategies widens as the initial state (attacker’s lead) increases, indicating that LHBSbM’s advantage in gaining extra mining time becomes more pronounced with a larger initial lead.
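Why the gap widens with the initial lead can be illustrated with a deliberately simplified model that is not the paper's simulator. Treat the private-chain lead as a biased random walk that rises by one when the attacker mines and falls by one otherwise, and count the public blocks mined before the lead first reaches 0. Bribery, the follow ratio r, and the triple-fork rules are all ignored, and the function names and the symbol `attacker_power` are assumptions of this sketch.

```python
import random


def public_blocks_until_caught(initial_lead, attacker_power, rng):
    """Count non-attacker blocks mined before the lead first reaches 0.

    Requires attacker_power < 0.5 so that the walk returns to 0 with
    finite expected time.
    """
    lead = initial_lead
    public_blocks = 0
    while lead > 0:
        if rng.random() < attacker_power:
            lead += 1            # attacker extends the private chain
        else:
            lead -= 1            # a public block consumes one block of lead
            public_blocks += 1
    return public_blocks


def mean_public_blocks(initial_lead, attacker_power=0.4, trials=20000, seed=1):
    """Monte Carlo average of public_blocks_until_caught over many cycles."""
    rng = random.Random(seed)
    total = sum(
        public_blocks_until_caught(initial_lead, attacker_power, rng)
        for _ in range(trials)
    )
    return total / trials


if __name__ == "__main__":
    for lead in (2, 4, 8):
        print(lead, mean_public_blocks(lead))
```

For this walk the expected count is L(1 − a)/(1 − 2a) for initial lead L and attacker power a, so it grows linearly with the initial lead. Any strategy that removes roughly twice as many public blocks per unit of lead would therefore show a difference that widens with L, which is consistent in spirit with Figure 16.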
4.2.2. Optimized Attack Revenue
The second experiment is designed to compare the attack revenue of LHBSbM against BSbM more directly. In this setup, the attacker employs the LHBSbM strategy when their private chain lead is greater than two blocks and reverts to the BSbM strategy in other states. It is assumed that rational target miners will only assist the attacker if the attacker’s hash power exceeds 0.35; otherwise, they behave as honest miners.
Figure 17 shows the attack revenue, where E-LHBSbM (solid lines) represents the revenue from BSbM optimized with the LHBSbM strategy, and E-BSbM (dashed lines) represents the revenue from the original BSbM attack, for r values of 0, 0.5, and 1. The results clearly indicate that the BSbM attack, when optimized with the LHBSbM triple-fork strategy, yields higher revenue for the attacker compared to the original BSbM attack across the tested r values.
4.3. Summary
The simulation results validate the theoretical analyses of both the BSbM and LHBSbM attack strategies. For the BSbM attack, it is demonstrated that accepting the attacker’s bribe is the optimal strategy for a rational target miner, providing them with higher potential revenue compared to refusing the bribe or honest mining. The attacker, in turn, can achieve greater revenue than honest mining by employing BSbM, with the success threshold being lower when the target miner cooperates.
The LHBSbM optimization further enhances the attacker’s capabilities. By creating a triple fork, LHBSbM allows the attacker to secure more extra mining time (by orphaning more blocks) and achieve higher overall attack revenue compared to the standard BSbM approach. These findings underscore the increased risks posed by such advanced selfish mining variants to Proof-of-Work blockchains.
5. Discussion
This section reviews our work, discusses potential defenses against such attacks, and outlines directions for future research.
5.1. Review of Our Work
This study investigates symmetry-breaking attacks in Proof-of-Work (PoW) blockchains perpetrated by rational miners and proposes two new and more sophisticated mining attack strategies. First, we introduce Bribery–Stubborn Mining (BSbM), which innovatively combines economic bribery with the nonconceding tactics of stubborn mining to incentivize targeted miners to collaborate during forks. Compared with bribery–selfish mining, BSbM leverages stubborn mining’s nonconceding behavior to prolong fork competition, thereby making targeted bribes more effective.
Building on this, and considering the threat posed by more complex fork scenarios, we design Leading Hidden Bribery–Stubborn Mining (LHBSbM). LHBSbM uses concealed leading actions to construct a triple fork, more efficiently orphaning blocks mined by honest miners and bribed miners. Relative to BSbM, LHBSbM increases the attacker’s payoff ceiling by raising the honest block orphaning rate and extending the lead of the attacker’s private chain.
5.2. Defensive Measures
The purpose of studying these attacks is to anticipate possible adversarial behaviors and develop stronger defenses to protect the network. For hybrid attacks like BSbM and LHBSbM that combine economic incentives with protocol manipulation, potential countermeasures can be explored at the protocol level, the network level, and within economic and game-theoretic models.
Protocol-Level Modifications
Incentive restructuring: Introduce penalty mechanisms that detect anomalous timestamps and penalize delayed block publication, increasing the cost of stubborn withholding.
Weighting optimization: Develop time-based weighting frameworks so the protocol deprioritizes blocks that appear to have been strategically withheld.
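The timestamp-based idea above can be sketched as a simple heuristic: compare the time claimed in a block header with the time a node first received the block, and flag large gaps as possible withholding. Everything here is hypothetical, including the `BlockObservation` record, the `flag_delayed_blocks` function, and the 600 s threshold; none of it is a mechanism specified in this paper or in any deployed protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockObservation:
    """One block as seen by a monitoring node (hypothetical record)."""
    block_hash: str
    header_time: float  # timestamp claimed in the block header, in seconds
    first_seen: float   # local time the node first received the block, in seconds


def flag_delayed_blocks(observations, max_delay=600.0):
    """Return hashes of blocks received more than max_delay seconds after
    their claimed header time. The threshold is an arbitrary assumption."""
    return [
        obs.block_hash
        for obs in observations
        if obs.first_seen - obs.header_time > max_delay
    ]


if __name__ == "__main__":
    seen = [
        BlockObservation("a1", header_time=1000.0, first_seen=1012.0),
        BlockObservation("b2", header_time=1600.0, first_seen=3100.0),
    ]
    print(flag_delayed_blocks(seen))
```

Because header timestamps are set by the miner, a withholding attacker could also misreport them. A heuristic of this kind would therefore need to be combined with other signals before any penalty is applied.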
Network-Level Monitoring
Anomaly detection: Deploy monitoring that flags abnormal activity by analyzing indicators such as transaction volume, serial numbers, and mining cost.
Pool behavior identification: Detect dishonest pools by checking whether the claimed previous-block hash has already been published on-chain.
Countermeasures for stealthy variants: For attacks that suppress orphan blocks to evade detection, incorporate deeper behavioral pattern analysis.
Economic and Game Theory Countermeasures
5.3. Potential Directions for Future Research
Drawing on prior work and current AI techniques, we outline several promising directions for advancing blockchain security:
AI for strategy discovery and dynamic defense: Use generative AI (e.g., ChatGPT [58], DeepSeek [59], Google Gemini [60]) to propose priors for composite attack and defense strategies; build AI-based dynamic defense systems that analyze miner behavior and fork signals in real time and recommend parameter adjustments such as confirmation depth, tie-breaking rules, and timestamp penalties.
Multi-attacker modeling: Extend the model from a single attacker to environments with multiple competing or collaborating attackers; use evolutionary and Markov games to characterize bribery target selection, bidding, and payoff allocation equilibria and analyze their effects on stability and thresholds; and integrate contract-based anti-bribery and study incentive compatibility and anti-collusion conditions.
Model refinement with decision processes: Employ Markov decision processes to model bribery-based selfish mining, analyze dynamic rewards, and optimize attacker behavior.
Extended multi-attacker equilibria: Further generalize to competitive or cooperative multi-attacker settings using evolutionary, stochastic, and Markov games to characterize equilibria for bribery target selection and payoff allocation and to analyze thresholds, stability, and convergence.
6. Conclusions
This paper investigated strategic attacks on PoW blockchains and introduced two hybrid mining strategies: Bribery–Stubborn Mining (BSbM) and its advanced variant, Leading Hidden Bribery–Stubborn Mining (LHBSbM).
BSbM combines economic bribery with stubborn mining behavior to create an incentive-compatible setting in which a rational target miner optimally accepts a bribe and assists the attacker during forks, increasing the likelihood of disproportionate rewards relative to the attacker’s hash share. Building on this, LHBSbM orchestrates carefully timed delayed publication to construct triple-fork states; our analysis and simulations show that it can simultaneously orphan multiple honest blocks, significantly extend the lead of the attacker’s private chain, and yield higher overall revenue than traditional double-fork attacks.
These results highlight the growing sophistication of mining attacks and show that PoW consensus remains vulnerable when economic incentives and protocol-level tactics interact. They underscore the need for defenses at the protocol, network, and pool levels and lay the groundwork for strengthening PoW systems against hybrid bribery and stubborn mining attacks.