1. Introduction
The rise of Advanced Persistent Threats (APTs) and zero-day vulnerabilities in cyberspace [1] has rendered the traditional passive defense paradigm—boundary isolation combined with signature detection—increasingly ineffective against the evolving landscape of cyberattacks. Dynamic heterogeneous redundancy (DHR), a core technology for active defense, enhances vulnerability tolerance by creating heterogeneous execution environments and employing redundant result comparison. However, traditional DHR architectures depend on a centralized control module, which is a critical vulnerability in security defense. Centralized DHR systems are susceptible to two major risks: first, single-point failures, where loss of the control node directly disables the entire security decision-making process; second, trust bottlenecks, where a malicious control node could manipulate isolation instructions or alter response comparison results, undermining data integrity. This flaw significantly limits the large-scale deployment of DHR in security-critical domains such as financial transactions and industrial control systems.
To overcome the limitations of centralization, distributed consensus algorithms have emerged as a key technical solution. The Raft consensus algorithm, built on majority-based consensus verification, offers clear crash fault tolerance (CFT) guarantees (tolerating f faulty nodes when f < N/2, where N is the total number of nodes in the cluster), ease of engineering implementation, and low latency. Raft has demonstrated industrial-grade utility in distributed systems such as the etcd store used by Kubernetes [2]. This paper proposes a decentralized DHR architecture with a Raft consensus cluster as the control core, integrated with a DHR execution layer, to establish a defense paradigm based on "consensus-driven security decision-making and heterogeneity-based vulnerability tolerance." In this system, the leader node proposes logs (e.g., business requests, executor isolation instructions), while follower nodes participate in consensus voting to ensure agreement on the system state. Additionally, the system leverages a set of heterogeneous execution components—service instances with varying operating systems, libraries, and algorithm implementations—to protect against attacks such as the mass exploitation of common vulnerabilities.
However, notable gaps remain in the existing research. On one hand, few studies have deeply integrated Raft consensus with the DHR architecture, and systematic designs for the collaboration logic of the "input proxy layer–consensus layer–execution layer–voting layer" pipeline are lacking; key mechanisms such as the distributed consensus submission of response logs (ResponseLogEntry) proposed in this paper have no standardized counterparts. On the other hand, the formal description of attacker models for such decentralized architectures is insufficient, and there is no quantitative trade-off analysis between security properties (consistency, availability) and performance overhead, making it difficult to delineate their defense boundaries and engineering feasibility.
This study systematically constructs a decentralized defense framework that integrates “Raft consensus with heterogeneous redundancy”, addressing key limitations in the existing integration of distributed consensus algorithms and active defense technologies. By formally defining a multi-dimensional adversary model—encompassing external attackers, internal executor attackers, internal node attackers, and collaborative Byzantine attackers—the work establishes a consistency guarantee mechanism under the condition f < N/2, thereby enhancing the theoretical foundation for security analysis in decentralized DHR architectures. Additionally, mechanisms such as “decentralized log-based voting” and “time-triggered Leader rotation” introduced in this paper offer a new technical paradigm for incorporating dynamism and randomness into active defense systems.
Against the single-point failures and trust bottlenecks of traditional DHR architectures, and the gaps identified above (insufficient Raft–DHR integration, lack of standardized consensus-driven security mechanisms), this study makes three key contributions, described below.
We design an optimized hierarchical architecture integrating Raft with DHR: deploying "leader routing" in the input proxy layer for efficient request forwarding, adapting "dual logs (BusinessLogEntry/ResponseLogEntry)" in the Raft layer to control security-critical decisions via global consistency, and adding a random executor selection mechanism in the heterogeneous execution layer. Together, these address the centralization limitation of traditional DHR.
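To make the dual-log design concrete, the following Python sketch models the two log entry types named in the paper. All field names beyond BusinessLogEntry and ResponseLogEntry themselves are illustrative assumptions, not the prototype's actual schema.

```python
from dataclasses import dataclass, field
import hashlib
import time

@dataclass
class BusinessLogEntry:
    """Replicated via Raft before any executor runs the request."""
    request_id: str         # globally unique ID; also used to reject replays
    payload: bytes          # serialized client request
    client_signature: bytes # verified by followers before voting
    term: int = 0           # Raft term/index are assigned by the leader
    index: int = 0

@dataclass
class ResponseLogEntry:
    """Commits the arbitrated result of the heterogeneous executors."""
    request_id: str
    result_hash: str        # hash of the majority result
    executor_votes: dict = field(default_factory=dict)  # executor_id -> response hash
    timestamp: float = field(default_factory=time.time)

def digest(result: bytes) -> str:
    """Canonical hash used when comparing executor outputs."""
    return hashlib.sha256(result).hexdigest()
```

A BusinessLogEntry is committed first; once the selected executors return, their hashed outputs are recorded in a ResponseLogEntry, giving every node an identical, auditable view of both the request and its arbitrated response.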
We build an atomic anomaly handling mechanism based on the “anomaly detection–isolation proposal–collaborative isolation–service recovery” flow. Isolation instructions take effect only after majority consensus, with an “anomaly evidence chain” (ID, deviated hash, timestamp) for traceability—this solves the untrustworthy isolation and inconsistent states in traditional DHRs.
We develop a prototype system with five heterogeneous nodes and conduct comprehensive security performance tests: in the security tests, we simulated four attack scenarios (external, internal executor/node, collusive Byzantine) to quantify the detection rate, availability, and consistency; performance tests were used to measure throughput, latency, and resource consumption to clarify overhead boundaries—this provides data support for engineering applications.
The remainder of this paper is structured as follows:
Section 2 reviews related work on DHR and Raft consensus;
Section 3 presents the core components of the decentralized DHR architecture, including the normal business request processing flow, anomaly detection and isolation mechanisms, and proactive security strategies;
Section 4 formally defines the attacker tuple and describes the behavioral characteristics of the four types of attackers by category;
Section 5 systematically analyzes the architecture’s guarantee mechanisms in terms of consistency, availability, integrity, and confidentiality, and demonstrates its defense boundaries;
Section 6 introduces the experimental environment configuration, designs four attack simulation schemes, and analyzes core indicators such as attack detection rate and system availability;
Section 7 tests the architecture’s throughput, latency, and resource consumption under normal and attack scenarios to quantify the performance overhead;
Section 8 summarizes the research results.
Limitations and scope of Raft security: It is important to emphasize that the underlying consensus layer in our architecture remains Raft, which is a crash-fault-tolerant (CFT) protocol, rather than being a Byzantine-fault-tolerant (BFT) one. Our design does not attempt to convert Raft into a full BFT protocol. Instead, we layer additional mechanisms on top of Raft—such as cryptographic validation of client requests and response logs, majority voting in the heterogeneous execution layer, and evidence-based isolation—to constrain the damage that misbehaving nodes can cause when the majority of nodes are still honest. Throughout the paper, we therefore distinguish between formally guaranteed properties under Raft’s CFT fault model and empirical stress test results under collaborative Byzantine-like behaviors, and we explicitly characterize the corresponding defense boundary.
Security scope and artifact availability: Raft's guarantees break under arbitrary Byzantine behaviors such as equivocation, forged votes, or divergent log replication. Our contribution is to layer a heterogeneous redundant execution and evidence-carrying response-log pipeline on top of Raft, so that the system can maintain correct outputs and auditable isolation decisions even when some executors or nodes behave maliciously, while preserving Raft's CFT safety at the log level. To support transparency and evaluation, we provide a runnable Docker-based prototype artifact that enables reviewers to observe representative architectural behaviors of the proposed system under controlled configurations.
4. Adversary Model
To formally analyze the security properties of the system, this section models potential adversaries based on the Dolev–Yao model and the Byzantine fault tolerance model. We consider the adversary's position in the system, capabilities, acquired knowledge, and attack objectives to define the types and behaviors of adversaries.
4.1. Adversary Capability Model
We define the adversary's capabilities using the tuple Adv = ⟨L, K, C, O⟩ [24], where:
L (location) denotes the adversary's position in the system, including the external network, internal nodes, etc.;
K (knowledge) represents the knowledge possessed by the adversary, including the system architecture, protocol details, etc.;
C (capabilities) refers to the actions available to the adversary, such as tampering with messages or delaying responses;
O (objectives) indicates the adversary's goals, such as disrupting consistency or stealing data.
According to the adversary’s position and capabilities, we classify adversaries into the following four types:
External adversary: The adversary is located outside the system and can only interact with the system through public interfaces. Its capabilities include eavesdropping, tampering with, or injecting network messages, but it cannot directly access the system’s internal state.
Internal component adversary: The adversary has compromised one or more heterogeneous executors (Ax) and can arbitrarily tamper with the output and behavior of these executors. However, it cannot break through container isolation to access other executors or Raft nodes.
Internal node adversary: The adversary has taken control of a Raft node (Fi), including the heterogeneous executors on that node. It can arbitrarily manipulate the state of the node and send fake Raft messages but cannot directly control other nodes.
Collaborative Byzantine adversary: The adversary controls multiple Raft nodes simultaneously (no more than f nodes, where f < N/2, and N is the number of Raft nodes). These nodes can act collaboratively to launch Byzantine attacks, including behaviors that arbitrarily deviate from the protocol.
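As an illustration of the ⟨L, K, C, O⟩ tuple, the following Python snippet instantiates one profile per adversary class; the concrete values are illustrative readings of the descriptions above, not a normative specification.

```python
# Illustrative instantiations of the <L, K, C, O> adversary tuple;
# the values paraphrase the four adversary classes described above.
ADVERSARIES = {
    "external": {
        "L": "outside the system (public interfaces only)",
        "K": {"architecture", "protocol"},
        "C": {"eavesdrop", "tamper", "inject"},
        "O": {"availability", "integrity"},
    },
    "internal_executor": {
        "L": "compromised heterogeneous executor Ax",
        "K": {"architecture", "protocol", "partial heterogeneity"},
        "C": {"forge output", "delay response"},
        "O": {"integrity"},
    },
    "internal_node": {
        "L": "compromised Raft node Fi (and its executors)",
        "K": {"architecture", "protocol", "node state"},
        "C": {"fake Raft messages", "arbitrary node-state manipulation"},
        "O": {"consistency", "integrity"},
    },
    "byzantine_coalition": {
        "L": "up to f Raft nodes, f < N/2",
        "K": {"architecture", "protocol", "node state"},
        "C": {"equivocation", "log forking", "selective response"},
        "O": {"consistency", "availability"},
    },
}

def within_fault_bound(controlled_nodes: int, total_nodes: int) -> bool:
    """Check the f < N/2 condition assumed for the Byzantine coalition."""
    return controlled_nodes < total_nodes / 2
```

For the five-node prototype, within_fault_bound(2, 5) holds while within_fault_bound(3, 5) does not, matching the f = 2 limit used in the collaborative Byzantine experiment.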
4.2. Assumptions About Adversary Knowledge
We assume that the adversary may possess the following knowledge: The system’s architectural design, including the existence of the Raft cluster and heterogeneous executors; the basic principles and communication mechanisms of the Raft consensus protocol; the heterogeneous configuration of some executors (e.g., operating systems, software versions); the specifications of cryptographic primitives (e.g., hash functions, digital signatures).
At the same time, we assume that the adversary does not possess the following knowledge: The private keys or long-term credentials of other nodes; complete details of the heterogeneous configuration of all executors; the system’s real-time status (e.g., the current leader node, health status), unless obtained through an attack.
4.3. Formalization of Attack Objectives
The adversary’s objectives can be formalized as one or more of the following:
Disrupting availability: Rendering the system unable to respond to client requests normally through denial-of-service (DoS) attacks.
Disrupting consistency: Causing the system state to diverge, where different nodes hold different states.
Disrupting integrity: Tampering with the system state or business data to cause unauthorized state changes.
Disrupting confidentiality: Stealing sensitive internal system data or business data.
Persistent control: Planting backdoors in the system to maintain long-term control.
4.4. Formalization of Adversary Behaviors
We use a state transition model to describe the adversary's behaviors. Let the system state be S and the set of adversary actions be A; the adversary's behavior can then be expressed as a transition function δ: S × A → S, where each action a ∈ A maps the current system state to an attacker-influenced successor state. The adversary's actions include, but are not limited to, those shown in Table 2.
For internal node adversaries, their actions also include those shown in Table 3.
4.5. Adversary Profiles
Based on the above model, we define two typical adversary profiles:
Profile 1: malicious executor adversary—shown in Table 4.
Profile 2: malicious Raft node adversary—shown in Table 5.
It is worth noting that this adversary model is intentionally more general than the failure model natively handled by Raft. While we model collaborative Byzantine adversaries for completeness, the underlying consensus protocol remains CFT. Consequently, in the subsequent security analysis we carefully separate what can be provably guaranteed under Raft’s crash-fault assumptions from what is only empirically observed under stronger Byzantine-style behaviors in our prototype.
4.6. Timeout Handling and Benign Performance Variations
In practice, response timeouts alone are not sufficient to classify an executor as malicious. The system first establishes baseline latency distributions under benign load and chooses timeout thresholds based on the P95–P99 percentiles of the observed latency. Occasional outliers beyond this threshold are treated as performance anomalies but not immediately as malicious behavior. Only persistent deviations across multiple independent requests—e.g., a statistically significant shift in the latency distribution or repeated timeouts for the same executor—will promote an executor from “suspected” to “malicious”, at which point an isolation proposal is issued. This two-stage process reduces false positives under bursty or high-load conditions while still allowing the system to isolate truly compromised executors.
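A minimal sketch of this two-stage promotion logic, assuming a P99-based threshold and a small consecutive-timeout limit (both illustrative parameters, not the prototype's tuned values):

```python
class ExecutorHealth:
    """Two-stage timeout handling: an executor is 'suspected' before it is
    ever classified as 'malicious'. Threshold choice and the promotion limit
    are illustrative assumptions based on the P95-P99 policy in the text."""

    def __init__(self, baseline_latencies, suspect_limit=3):
        xs = sorted(baseline_latencies)
        # P99 of the benign baseline serves as the timeout threshold
        self.threshold = xs[min(len(xs) - 1, int(0.99 * len(xs)))]
        self.suspect_limit = suspect_limit
        self.consecutive_slow = 0
        self.state = "healthy"

    def observe(self, latency):
        """Feed one observed request latency; returns the current state."""
        if self.state == "malicious":
            return self.state              # sticky: isolation already proposed
        if latency <= self.threshold:
            self.consecutive_slow = 0      # occasional outliers are forgiven
            self.state = "healthy"
        else:
            self.consecutive_slow += 1
            self.state = "suspected"
            if self.consecutive_slow >= self.suspect_limit:
                self.state = "malicious"   # triggers an isolation proposal
        return self.state
```

With a baseline of 1–100 ms, the threshold lands at 100 ms; a single 150 ms outlier only marks the executor "suspected", and three consecutive violations promote it to "malicious".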
All comparisons in our experiments use the normalization function defined in Section 3.3.5, so that benign formatting or rounding differences do not trigger false-positive anomalies.
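Since the normalization function itself is defined in Section 3.3.5 rather than here, the sketch below only illustrates the kind of canonicalization it performs (stable key ordering, compact whitespace, float rounding), under the assumption that executor responses are JSON.

```python
import json

def normalize(raw: str, float_precision: int = 6) -> str:
    """Canonicalize an executor response before hashing and voting, so that
    benign formatting differences across heterogeneous stacks do not count
    as divergence. This is a sketch, not the Section 3.3.5 definition."""
    def canon(v):
        if isinstance(v, float):
            return round(v, float_precision)   # absorb rounding noise
        if isinstance(v, dict):
            return {k: canon(v[k]) for k in sorted(v)}
        if isinstance(v, list):
            return [canon(x) for x in v]
        return v
    return json.dumps(canon(json.loads(raw)),
                      separators=(",", ":"), sort_keys=True)
```

Two responses that differ only in key order, whitespace, or sub-precision float noise normalize to the same string and therefore hash identically.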
6. Security Verification Experiments
To verify the security of the architecture against various attacks, this section designs experiments to validate the architecture’s defense capabilities against malicious executors, malicious nodes, and collaborative Byzantine adversaries.
6.1. Experimental Environment and Configuration
We built a cluster of five Raft nodes, each running on an independent virtual machine. Each node is deployed with corresponding heterogeneous executors, which differ significantly in operating system, dependency library versions, and business logic implementation. The node configuration is shown in Table 6.
We emphasize that the purpose of the experimental evaluation is to demonstrate architectural feasibility, consistency trends, and security-relevant behaviors under adversarial conditions, rather than to provide source-level reproducibility or serve as a reference implementation for low-level performance benchmarking. The experiments are designed to validate architectural mechanisms within the stated system assumptions.
In our prototype, each Raft node runs a JRaft-based Java control service together with a language-specific HTTP agent (Java/Spring Boot, Python/Flask, Go/Gin, Node.js/Express, PHP/ThinkPHP). The business logic consists of a simple account service with two operations: GET (read-only balance query) and INCREMENT/TRANSFER (state-changing update), both of which are encoded as BusinessLogEntry instances. For INCREMENT/TRANSFER, the consensus-committed new state is further processed by k heterogeneous executors, and their outputs are recorded as ResponseLogEntry instances in the dual-log pipeline. A Docker-based artifact, including deployment scripts and pre-built components, is provided to enable reviewers to deploy and observe a representative 5-node heterogeneous cluster using a single docker-compose command.
All experiments use JMeter as the load generator with:
Request rate: 100 QPS;
Experiment duration: 10 min;
Repeated trials: 30 runs per experiment;
Metrics collected: throughput (RPS), average latency, P95/P99 tail latency, executor divergence frequency, Raft commit latency, anomaly-detection latency.
All runs use the same docker compose setup to ensure consistency.
6.2. Experimental Design and Attack Simulation
Targeting the four adversary models, we designed four core experiments under the load profile of Section 6.1: each experiment lasted 10 min with the load generator sending standard business requests at 100 QPS (queries per second), and each was executed 30 times under identical configuration. We report mean values across runs with standard deviations; for latency-related metrics we additionally report P95 and P99 values to capture tail behavior.
6.2.1. External Adversary Simulation
Network attack experiment: Using the Scapy tool outside the cluster, we construct and send the following malicious traffic:
Replay attack: Capture legitimate requests and replay them immediately.
Message tampering: Modify key parameters in requests (e.g., user ID, transaction amount).
DDoS flooding: Send a large number of SYN packets to all node ports.
Verification target: Determine whether the system can ensure the non-reproducibility and integrity of requests and maintain service availability.
Expected results: The mechanism based on Raft log unique IDs and client signatures should reject all replay and tampered requests; cluster load balancing should mitigate the impact of DDoS.
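A minimal sketch of the replay and tampering checks expected here, using an HMAC as a stand-in for the client's digital signature and an in-memory set as a stand-in for the committed-log ID index (both simplifications of the actual mechanism):

```python
import hmac
import hashlib

SEEN_IDS = set()   # in the real system, derived from committed Raft log IDs

def verify_request(request_id: str, payload: bytes,
                   tag: bytes, key: bytes) -> bool:
    """Reject tampered requests (bad MAC) and replays (duplicate IDs).
    HMAC-SHA256 stands in for the client signature scheme."""
    expected = hmac.new(key, request_id.encode() + payload,
                        hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        return False               # tampered payload or forged ID
    if request_id in SEEN_IDS:
        return False               # replayed request
    SEEN_IDS.add(request_id)
    return True
```

Because the MAC binds the unique request ID to the payload, a tampered transaction amount fails the MAC check, and an immediately replayed capture fails the duplicate-ID check even though its MAC is valid.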
6.2.2. Internal Component Adversary Simulation: Malicious Executor Experiment
Randomly select one executor (e.g., the Go executor on Node 3) and inject the following malicious behaviors during its request processing:
Deterministic error return: Always return fixed error results for specific requests.
Random error return: Return randomly tampered results with a 30% probability.
Response delay: Return responses after a random delay of 100–500 ms to simulate slow-rate attacks.
Verification target: Whether the majority voting mechanism can identify abnormal responses and trigger the isolation process for the malicious executor.
Expected results: The system should derive correct results through majority responses and generate valid isolation logs for the malicious executor.
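The majority arbitration over executor outputs can be sketched as follows. Hashing each response before counting mirrors the deviated-hash evidence recorded in isolation logs; the prototype's actual pipeline additionally applies the normalization of Section 3.3.5 before hashing.

```python
from collections import Counter
import hashlib

def vote(responses):
    """Majority vote over executor outputs (executor_id -> raw bytes).
    Returns (winning_hash, deviating_executor_ids); deviators feed the
    isolation proposal, with their hashes serving as anomaly evidence."""
    hashes = {ex: hashlib.sha256(r).hexdigest()
              for ex, r in responses.items()}
    winner, count = Counter(hashes.values()).most_common(1)[0]
    if count <= len(responses) // 2:
        return None, sorted(hashes)        # no strict majority: reject all
    deviators = sorted(ex for ex, h in hashes.items() if h != winner)
    return winner, deviators
```

With three executors where one returns a tampered result, the two matching responses win the vote and the deviating executor is reported for isolation.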
6.2.3. Internal Node Adversary Simulation: Malicious Raft Node Experiment
Select Node 2 as the malicious node and simulate two attack scenarios:
Malicious follower: Discard 50% of the AppendEntries RPCs from the leader.
Malicious leader: Submit illegal isolation instructions to attempt to isolate a healthy executor (e.g., the Java executor on Node 1); submit malicious business logs containing an illegal request without a client signature.
Verification target: Determine whether follower nodes can verify and reject illegal proposals, and whether the system can dismiss the malicious leader through re-election.
Expected results: Illegal isolation instructions will be rejected by the majority of nodes due to insufficient evidence; malicious business logs will be rejected due to signature verification failure; the abnormal behavior of Node 2 will lead to the early termination of its term.
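The follower-side rejection of an evidence-free isolation proposal can be sketched as below. The evidence fields (request ID, deviated hash, timestamp) follow the anomaly evidence chain described in Section 1, but the record layout itself is illustrative.

```python
def validate_isolation_proposal(proposal: dict,
                                local_response_logs: dict) -> bool:
    """Follower-side check before voting on an isolation instruction.
    The proposal must carry an anomaly evidence chain that matches this
    follower's own committed ResponseLogEntry records; field names are
    illustrative, not the prototype's exact schema."""
    ev = proposal.get("evidence")
    if not ev:
        return False                       # no evidence: reject outright
    entry = local_response_logs.get(ev.get("request_id"))
    if entry is None:
        return False                       # unknown request
    recorded = entry["executor_votes"].get(proposal["executor_id"])
    # The deviated hash must match what this follower recorded for that
    # executor, and must actually differ from the majority result.
    return recorded == ev.get("deviated_hash") and recorded != entry["result_hash"]
```

A proposal to isolate a healthy executor fails the final check, because that executor's recorded hash equals the majority result; a proposal with no evidence is rejected before any lookup.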
6.2.4. Collaborative Byzantine Adversary Simulation: Byzantine Attack Experiment
Control Node 4 and Node 5 simultaneously (f = 2), making them perform the following collaborative Byzantine behaviors:
Contradictory voting: Cast affirmative votes for different candidates during elections.
Log forking: Send log entries with different sequences to different followers.
Selective response: Process read requests normally but do not respond to any write requests.
Verification target: Determine whether, when f < N/2, the system can ensure safety (state consistency, no illegal log commitment) and maintain limited availability.
Expected results: In the configured stress test, where at most two nodes behave in a Byzantine-like manner, we expect that the three non-malicious nodes (Node 1, Node 2, Node 3) will maintain fully consistent logs and state machines. Because a majority of three votes cannot be reached for new write entries, write requests are likely to be temporarily blocked, while read requests served from committed state can still succeed at a high rate. In our experiments we did not observe any incorrect log commitments, but we emphasize that this is an empirical result for this restricted adversary configuration rather than a general BFT guarantee. The security verification metrics and evaluation criteria are summarized in Table 7.
6.3. Experimental Results and Analysis
We used attack detection rate, system availability (proportion of successfully responded requests), and state consistency (whether the state machine hash values of all non-malicious nodes are identical) as core evaluation indicators; standard deviations are within 3–5% of the mean.
Consistency guarantee: In all four types of attack experiments, the state machines of all non-malicious nodes finally remained consistent, verifying the effectiveness of Raft logs as the “single source of truth”.
Anomaly detection and isolation: Experiments 2 and 3 showed that the majority voting and evidence verification mechanisms based on consensus can detect and isolate internal threats in near real time.
Availability trade-off: Experiment 4 showed that when facing collaborative Byzantine attacks, the system prioritizes security (consistency) while sacrificing partial availability, consistent with the expectations of the CAP theorem.
6.4. Experimental Conclusions
The experiments in this section comprehensively verified the security defense capabilities of the proposed architecture. The results show that the architecture can effectively resist various attack modes, from external networks to internal nodes, and from non-collaborative to collaborative attacks. The Raft consensus layer is the core of the defense, ensuring the global consistency and tamper resistance of all security decisions. Heterogeneous executors and the majority voting mechanism together form a "security filter" for the execution layer, which can effectively filter out malicious behavior at single or a few points. Under extreme Byzantine faults, the architecture adheres to its safety bottom line and ensures the eventual consistency of the system.
6.5. Discussion of Additional Adversarial Stress Tests
Although we did not implement full-scale adversarial experiments for extended attack scenarios in the current prototype, we provide a conceptual analysis of how the proposed architecture behaves under four representative adversarial patterns commonly encountered in distributed systems: partial message withholding, forced leader churn, timestamp manipulation, and multi-partition split-brain scenarios.
6.5.1. Partial Message Withholding
If a follower selectively drops AppendEntries messages, Raft preserves safety because log entries cannot be committed without a majority. Write availability may degrade, but honest nodes never diverge. The DHR layer continues to validate executor outputs as long as at least one honest executor remains active.
6.5.2. Forced Leader Churn
Frequent disruptions of the leader—with the intention of destabilizing election cycles—reduce system liveness but do not affect Raft’s safety properties. The system delays committed operations but prevents conflicting logs from being committed. This behavior illustrates Raft’s known trade-off: sacrificing availability to preserve safety under unstable leadership.
6.5.3. Timestamp Manipulation
Because Raft orders logs strictly by (term, index), client-side or executor-side timestamp tampering cannot influence log ordering. The DHR layer attaches evidence hashes and signatures to response logs, preventing malicious reordering or rollback attacks.
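A small Python illustration of why timestamp tampering is inert: sorting replicated entries by the (term, index) pair ignores any attacker-supplied timestamps.

```python
def log_position(entry: dict) -> tuple:
    """Raft orders entries strictly by (term, index); any timestamp the
    client or an executor attaches is irrelevant to log ordering."""
    return (entry["term"], entry["index"])

entries = [
    {"term": 2, "index": 1, "timestamp": 999},   # forged "late" timestamp
    {"term": 1, "index": 5, "timestamp": 1},
    {"term": 2, "index": 2, "timestamp": 0},     # forged "early" timestamp
]
ordered = sorted(entries, key=log_position)
# ordering is (1,5), (2,1), (2,2) regardless of the tampered timestamps
```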
6.5.4. Split-Brain Scenarios
Under multi-partition network splits, minority partitions lose the ability to commit writes but retain read availability if executor results are locally verifiable. Once connectivity is restored, logs are reconciled consistent with Raft’s majority-based safety rules. This demonstrates the defense boundary of the architecture: safety is preserved, but availability degrades under severe partitions.
A full empirical evaluation of these stress tests is left for future work and requires a more extensive fault-injection framework. The conceptual analysis provided here complements the formal security model and clarifies the behavior of the architecture under adversarial stress conditions beyond the crash-fault assumption.
8. Conclusions
To address the core challenges of single-point failure and trust bottleneck in the centralized control of dynamic heterogeneous redundancy (DHR) architectures, this paper is the first to systematically propose and implement a decentralized DHR architecture based on Raft consensus. The main conclusions and contributions of this study are summarized in the four points below.
First, we designed and implemented a decentralized control plane that deeply couples the Raft consensus mechanism with the DHR execution layer. By introducing a dual-log pipeline consisting of BusinessLogEntry (business request log) and ResponseLogEntry (response log), this architecture places all security-critical decisions—including client request distribution, heterogeneous executor scheduling, multi-mode response arbitration, and abnormal executor isolation—under the strict sequential management of the Raft state machine. This design fundamentally eliminates the security risks of traditional DHR relying on a single control node, shifting the system’s trust foundation from a single entity to a consensus mechanism maintained by a majority of nodes.
Second, by formally defining a multi-dimensional adversary model that includes external adversaries, internal executor adversaries, internal node adversaries, and collaborative Byzantine-style adversaries, we systematically analyze the security properties and fault boundaries of the proposed architecture. Under the classic Raft assumption f < N/2, our design preserves state consistency and business integrity for all honest nodes and prevents any single compromised node from unilaterally committing illegal logs or performing unauthorized isolation. Beyond this CFT envelope, the heterogeneous execution and dual-log evidence pipeline allow the system to detect and gradually remove malicious executors and nodes, while explicitly accepting that liveness may be degraded in the presence of stronger-than-CFT adversaries.
Third, we constructed a prototype system with five heterogeneous nodes and designed comprehensive experiments to empirically evaluate both security and performance. The results show that, as long as a majority of selected executors remain honest, the architecture achieves near-100% detection and isolation rates for the four representative attack classes considered in this paper, while maintaining more than 90% service availability under non-Byzantine faults. These results should be interpreted as conditional guarantees under the stated adversary model, not as unconditional Byzantine fault tolerance of the underlying consensus protocol.
Fourth, performance testing revealed the trade-off between security and efficiency in the architecture. In the baseline scenario without attacks, the architecture's performance is comparable to that of conventional distributed systems; when under attack, the system exchanges controllable performance overhead (a 10.4–21.6% reduction in throughput) for correct business logic and strong security resilience. This trade-off is necessary and acceptable for security-critical domains such as financial transactions and industrial control.
Finally, we provide a runnable Docker-based prototype package that allows reviewers to observe the execution workflow and security-relevant behaviors of the proposed architecture without exposing internal source code. The complete implementation will be made available after acceptance in accordance with institutional policies, to facilitate further inspection and follow-up research, without affecting the scope of the current experimental evaluation. In summary, this research not only addresses the core flaws of traditional DHR architectures but also explores an effective path to transform the reliability guarantee capability of distributed consensus into the endogenous security capability of active defense systems. It provides an engineering-feasible solution for building the next-generation network defense infrastructure with high trustworthiness and measurability.