WebRTC Swarms: Decentralized, Incentivized, and Privacy-Preserving Signaling with Designated Verifier Zero-Knowledge Authentication

Rafał Skowroński

doi:10.3390/fi18010013

Abstract

Real-time peer-to-peer communication in web browsers typically relies on centralized signaling servers, creating single points of failure, privacy vulnerabilities, and censorship risks. We present WebRTC Swarms, a fully decentralized signaling architecture integrated into GRIDNET OS that combines onion-routed relay circuits with designated verifier zero-knowledge authentication and cryptoeconomic incentives. The proposed system empowers peers to discover and connect without exposing identities or IP addresses through an overlay of incentivized full nodes that carry signaling traffic using transmission tokens. We introduce a MAC-based designated verifier ZK authentication protocol allowing peers sharing a pre-shared key to mutually authenticate without revealing the key, ensuring only authorized participants can join sessions while preserving unlinkability to outsiders across sessions. Through formal verification using TLA+, we prove key safety and liveness properties of both the signaling protocol and the authentication mechanism. Empirical evaluation demonstrates near-100% NAT traversal success via incentivized decentralized TURN relaying (compared to approximately 85% for STUN-only approaches), join latencies under 2 s for swarms of dozens of peers, and strong resilience against Sybil and denial-of-service attacks through token-based rate limiting. Our work represents the first practical integration of decentralized WebRTC signaling with designated verifier cryptographic authentication and built-in economic incentives, providing a privacy-first substrate for secure, community-governed communication networks.

Keywords:

WebRTC; decentralized signaling; zero-knowledge proofs; onion routing; cryptoeconomic incentives; group signatures; peer-to-peer networks; privacy-preserving protocols

1. Introduction

Real-time communication technologies have become fundamental to modern internet infrastructure, yet they overwhelmingly depend on centralized signaling servers for session establishment and coordination. This architectural dependency introduces critical vulnerabilities: single points of failure that can disrupt service, privacy concerns from metadata exposure, and censorship opportunities for adversaries controlling signaling infrastructure.

WebRTC (Web Real-Time Communication) [1] has emerged as the dominant standard for browser-based peer-to-peer communication, enabling direct media exchange between endpoints without intermediary servers once connections are established. However, WebRTC’s initial signaling phase—where peers exchange Session Description Protocol (SDP) offers/answers and Interactive Connectivity Establishment (ICE) [2] candidates—traditionally requires a centralized signaling server. This server learns which users communicate with whom, when, and from what IP addresses, creating a comprehensive metadata profile vulnerable to surveillance and censorship.

1.1. Motivation

The primary motivation for this research stems from two critical requirements in the context of user interface decentralized applications (UI dApps) [3]:

(1) Cross-Browser Communication Without Centralized Signaling: UI dApps—browser-based decentralized applications operating within the GRIDNET OS ecosystem [3] (currently optimized for Chromium-based browsers, though the underlying protocol supports any WebRTC-compliant client) using standard web APIs (JavaScript/WebAssembly)—require the ability to establish peer-to-peer WebRTC connections for real-time audio, video, and data exchange without relying on centralized signaling servers. Traditional WebRTC deployments create architectural dependencies that contradict the decentralization principles fundamental to blockchain-based systems and decentralized operating systems. Our work addresses this gap by providing a fully decentralized signaling substrate that enables UI dApps to discover and connect with each other through a self-organizing overlay network of incentivized relay nodes.

(2) Privacy-Preserving Group Authentication: For private group communication scenarios, we require a mechanism where each participant possesses a pre-shared key (PSK), and peers can mutually authenticate their possession of this key without revealing it to eavesdroppers, network observers, or even other group members. Critically, no entity—including network intermediaries, passive observers, or man-in-the-middle attackers—should be able to learn the actual pre-shared key (PSK). Yet, two entities who independently possess the same PSK must be able to cryptographically prove to each other that they both hold it, enabling mutual authentication without PSK disclosure. This zero-knowledge property ensures that even if all network traffic is captured and analyzed, the pre-shared secret remains protected while still providing strong authentication guarantees. Our threat model explicitly accounts for adversaries attempting to extract, intercept, or infer the PSK through traffic analysis, replay attacks, or protocol manipulation.

1.2. Contributions

We present WebRTC Swarms, a novel architecture that eliminates centralized signaling dependencies while providing stronger privacy guarantees than existing peer-to-peer solutions. While our implementation is integrated into GRIDNET OS, the research contributions, algorithms, and protocols presented are universal and applicable to any WebRTC-based system requiring decentralized signaling and privacy-preserving group authentication. Our system makes three primary contributions:

(1) Decentralized, Incentivized Signaling Protocol: We implement a fully distributed signaling network where full nodes (participants in the GRIDNET OS network) form an onion-routed overlay that forwards SDP and ICE messages through multi-hop circuits. Distinction: Full nodes act as signaling relays (forwarding text-based protocol messages) and may optionally serve as media relays (TURN servers for actual audio/video/data packets when direct peer-to-peer fails). Each signaling message carries micro-payments (transmission tokens) that compensate relays. These tokens represent a unit of computational utility and can be exchanged for the network’s native currency, thereby aligning economic incentives with reliable forwarding behavior. For a detailed explanation of how Sybil-proof data incentivization schemes operate in this context, we refer readers to our prior work on SPIDE [4]. This design ensures no single relay can link message sources to destinations while providing economic disincentives against network abuse.

(2) Zero-Knowledge Group Authorization Protocol: We introduce a lightweight cryptographic protocol enabling anonymous yet accountable group participation. Users prove membership in an authorized group without revealing their identity or any information linkable across sessions. The protocol prevents unauthorized participants while preserving privacy for legitimate users, balancing anonymity with accountability through revocable credentials.

(3) Formal Verification and Empirical Validation: We provide TLA+ [5] specifications of both protocols with formal proofs of safety (unauthorized users cannot join) and liveness (authorized users eventually succeed). Our implementation in GRIDNET Core demonstrates practical performance: near-perfect NAT traversal success, sub-2-s join latencies for moderate swarm sizes, and effective resistance to resource exhaustion attacks through token-based throttling.

This work builds upon prior research in decentralized systems, anonymous credentials, and incentive mechanisms, synthesizing these domains into a cohesive architecture specifically tailored for real-time communication. We show that privacy, decentralization, and performance can coexist in production systems without sacrificing any dimension.

1.3. Implementation and Reproducibility

The protocols and algorithms described in this paper are not merely theoretical constructs; they have been fully implemented and deployed in production. The JavaScript implementation of the zero-knowledge authentication protocol is open source and verifiable, available in our public GitHub repository (https://github.com/GRIDNETOS/GRIDNETOS, software version GRIDNET Core 1.9.1, accessed on 18 December 2025). The zero-knowledge proof state machine, as detailed in Section 7, is implemented in swarmconnection.js and actively used by the eMeeting UI dApp, which provides group video conferencing with real-time audio and video streams secured by our ZK authentication protocol. The system is publicly accessible at https://ui.gridnet.org (launch the eMeeting UI dApp). During testing, developers can observe detailed logs of all group chat zero-knowledge authentication and authorization events in the web browser’s console, providing full transparency into the protocol execution. This open-source, production-ready implementation validates the practical viability of our theoretical contributions and enables independent verification and extension by the research community.

The remainder of this paper is organized as follows: Section 2 reviews related work. Section 3 formalizes the problem statement and defines key primitives. Section 4 presents the system architecture and participant roles. Section 5 defines our threat model and security assumptions. Section 6 and Section 7 detail the signaling protocol and zero-knowledge authorization protocol, respectively, including formal specifications and proofs. Section 8 analyzes the cryptoeconomic incentive layer. Section 9 presents experimental evaluation results, and Section 10 concludes with future directions.

2. Related Work

Our work intersects decentralized signaling for P2P networks, onion-routing anonymity networks, group signatures and anonymous credentials, and token-incentivized network economics. It serves as a practical, integrated instance of these distinct research areas, applied to the domain of real-time communication.

2.1. Decentralized Signaling and P2P Networks

Traditional WebRTC applications use centralized servers for signaling. Decentralized alternatives have been explored. Matrix [6] provides a federated signaling network for VoIP but still relies on servers (homeservers) and does not prioritize anonymity. Jami [7] (formerly Ring) is a fully distributed calling platform that uses a DHT for peer discovery. Jami achieves serverless signaling via a Kademlia DHT to look up callee addresses but does not incorporate onion routing, so IP addresses can be exposed via DHT queries. WebDHT [8] attempted to create a browser-based DHT for signaling; Hyperswarm [9] (Dat’s discovery) replaced servers with distributed signaling. These systems achieve decentralization but not strong privacy guarantees, as they focus on scalability. Modular network stacks such as libp2p [10] have similarly addressed decentralized peer discovery, though often outside the browser constraint. Our approach differs by integrating privacy (onion routing) and economic sustainability (incentives) directly into the signaling layer, which is a novel combination in this context.

Proposals for browser-to-browser Tor integration exist. WebRTC Swarms share the goal of anonymity with Tor [11], but at the signaling level rather than the media level. Our design avoids the need for each peer to run a full onion client—it instead leverages full nodes as onion routers. This hybrid model is reminiscent of rendezvous points in Tor’s onion services [12]: the introduction point connects parties through Tor. Similarly, in our swarm, full nodes act as rendezvous proxies for peers.

2.2. Onion Routing and Anonymous Communication

Our onion routing implementation is conceptually similar to Tor and the original Onion Routing design [13]: layered encryption and multi-hop relays for anonymity. A key difference is that WebRTC Swarms operate at the application layer specifically for signaling and are not a general internet proxy. This specialization allows optimizations: signaling messages are relatively small and frequent, so per-message micropayments are feasible (whereas they might be too slow for every Tor packet). Also, the hop count is not fixed at three as in Tor; it is dynamic (flooding for the join, then likely one hop or a direct path for subsequent messages if the peer is reachable). The dynamic approach was chosen to ensure propagation to all group members; anonymity is still preserved by hiding the source identifier.

Timing correlation attacks remain a threat as in any low-latency network. While Tor has been extensively studied in this regard [14], our system might need similar scrutiny. However, group signaling traffic has different patterns (bursty around join events, then mostly quiet except ICE keepalives), perhaps making correlation harder.

2.3. Group Authorization and Credentials

Our zero-knowledge group authorization draws inspiration from group signatures (Chaum and van Heyst 1991 [15], BBS04 [16]) and anonymous credential systems (Idemix [17], U-Prove [18]). Group signatures allow a member to sign messages on behalf of a group anonymously, with the group manager able to open the signature if needed. Viewing the swarm join authorization as a “message” (“I am a member”), our ZK proof protocol is analogous to a group signature on a nonce. However, we do not implement a full cryptographic group signature (which requires heavy mathematics and an actual group public key). Instead, we use shared secrets or dual signatures with local verification; this method is simpler but requires trust in common secrets or prior key distribution.

Academic works such as Dissent [19] and Riposte [20] tackled accountable anonymous group communication using DC-nets or verifiable shuffles. Dissent provided anonymity within groups and could blame misbehaving members via a verifiable blame protocol. Our approach is more lightweight (symmetric crypto) and more practical for real-time media groups than heavy DC-nets, which do not scale well to low-latency settings.

Revocation mechanisms in anonymous systems have been studied (e.g., verifier-local revocation in group signatures, where each verifier holds a revocation list and the user proves non-membership). Integrating such advanced cryptography would enhance our scheme by allowing mid-session eviction without re-establishing the entire swarm secret; this is left for potential future work.

2.4. Cryptoeconomic Incentives

The idea of incentivizing network nodes with micropayments has precedents. Spore (an incentive system for Tor) proposed one-time-use tokens to prioritize traffic for paying users. TorCoin (2014) [21] proposed a Bitcoin-based reward for Tor relay throughput; more recently, AMP (Constellation 2021) [22] proposed probabilistic payments for mixnet nodes. Our MDTP scheme similarly rewards relays and charges users but is built into the protocol rather than bolted on. It is specifically tailored to signaling, where economic cost is finely metered (per SDP message). Its granularity is comparable to schemes such as NanoPay [23] or Spritz [24] micropayments for API usage.

Proof-of-Work (PoW) puzzles are a related concept from email spam prevention (Hashcash [25]). We chose tokens (representing pre-mined or stake-based currency) instead of PoW because PoW would burn CPU cycles and be slow in browsers. Tokens allow leveraging an existing blockchain or currency without per-message computation (just a signature or hash check).

The security-through-economics model is aligned with the Bitcoin and Ethereum networks, where spam is deterred by transaction fees. We apply a similar notion to signaling messages, drawing parallels to flow control in networking: a problem typically solved by technical means, which we instead solve by market means. This is a relatively novel approach for real-time communications. Success hinges on choosing a price such that normal usage is affordable but attacks are expensive, which is exactly analogous to blockchain fee mechanisms (e.g., Ethereum gas price markets [26]). Our dynamic pricing suggestion mirrors Ethereum’s EIP-1559 [27] fee escalation.

2.5. Prior Work on GRIDNET OS

This work builds on foundational research by Skowronski et al. [4] on GRIDNET OS and SPIDE (Sybil-Proof, Incentivized Data Exchange). SPIDE introduced a framework for incentivized data transmission where data streams carry micropayments compensating relays, providing economic Sybil resistance. WebRTC Swarms extend these concepts to the real-time signaling domain, showing how signaling datagrams can carry transmission tokens that incentivize their exchange, and how entire data streams can be incentivized as described in SPIDE. The integration of decentralized access rights provisioning and governance hooks in GRIDNET enables community-operated signaling infrastructure where token parameters, access policies, and software updates are decided through on-chain voting, ensuring long-term adaptability without centralized control.

3. Problem Formulation and Definitions

3.1. The Decentralized WebRTC Signaling Problem

Problem Statement: Given a set of peers $P = \{p_{1}, p_{2}, \dots, p_{n}\}$ wishing to establish WebRTC connections for real-time communication, design a protocol that enables SDP offer/answer exchange and ICE candidate dissemination without relying on centralized signaling servers, while satisfying the following requirements:

(R1) Decentralization: No single entity controls message routing or possesses complete knowledge of network topology.

(R2) Privacy: No intermediary node can learn both the source and destination of a signaling message (unlinkability).

(R3) Liveness: Honest peers possessing valid credentials eventually complete signaling and establish connections.

(R4) Safety: Unauthorized peers cannot participate in protected swarms.

(R5) Sybil Resistance: Adversaries cannot spawn unlimited identities to overwhelm the network.

(R6) Incentive Compatibility: Relay nodes are economically motivated to forward messages honestly.

3.2. Designated Verifier Zero-Knowledge Authentication Problem

Problem Statement. Design a mutual authentication protocol between two peers that share a pre-shared secret K such that (i) only parties with K can generate accepting proofs (soundness); (ii) to any outsider without K, protocol transcripts reveal no information about K nor link distinct sessions (indistinguishability and unlinkability); (iii) to the designated verifier (who knows K), proofs are non-transferable because the verifier can simulate transcripts on its own (DV–ZK).

Construction Overview. Using a keyed MAC with the context string $\texttt{ZKAuth-v1}$, the parties exchange fresh nonces $N_{B}$ and $N_{A}$, and the candidate responds with $Z = \mathrm{MAC}_{K}(\texttt{ZKAuth-v1} \,\|\, N_{A} \,\|\, N_{B})$. The validator accepts if and only if recomputation matches.
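The challenge-response exchange can be sketched as follows. This is a minimal illustration, not the paper's implementation: Python's standard library has no KMAC256, so HMAC-SHA3-256 is substituted as the keyed MAC, and the 32-byte nonce length and all function names are assumptions made for the example.

```python
import hashlib
import hmac
import secrets

CONTEXT = b"ZKAuth-v1"  # context string named in the text
NONCE_LEN = 32          # assumed nonce length in bytes


def mac(key: bytes, message: bytes) -> bytes:
    # Stand-in keyed MAC: HMAC-SHA3-256 replaces KMAC256 (an assumption).
    return hmac.new(key, CONTEXT + message, hashlib.sha3_256).digest()


def fresh_nonce() -> bytes:
    # Each party contributes one unpredictable nonce per session.
    return secrets.token_bytes(NONCE_LEN)


def prove(key: bytes, n_a: bytes, n_b: bytes) -> bytes:
    # Candidate's response Z, bound to both nonces.
    return mac(key, n_a + n_b)


def verify(key: bytes, n_a: bytes, n_b: bytes, z: bytes) -> bool:
    # Validator recomputes the tag and compares in constant time.
    return hmac.compare_digest(mac(key, n_a + n_b), z)
```

Because both nonces have a fixed length, plain concatenation is unambiguous here; a variable-length encoding would need explicit length prefixes.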

Definition 1 (EUF–CMA Soundness). A protocol is computationally sound if for any probabilistic polynomial-time (PPT) adversary $A$ not in possession of the secret key K, the probability that $A$ can successfully convince an honest verifier is negligible in the security parameter λ (i.e., smaller than the inverse of any polynomial function of λ), under the assumption that the MAC is existentially unforgeable under chosen-message attack (EUF–CMA), consistent with standard security arguments for digital signatures [28].

Formally, for any PPT $A$:

$$\Pr[\mathrm{Verify}(\mathrm{transcript}) = \mathrm{accept} \mid A \text{ does not know } K] \leq \mathrm{negl}(\lambda)$$

Definition 2 (Outsider Indistinguishability). For any PPT distinguisher $D$ that does not know K, the advantage in distinguishing real protocol transcripts from simulated ones (with uniformly random tags) is negligible in λ.

Definition 3 (Outsider Unlinkability). Given two protocol transcripts $\tau_{1}, \tau_{2}$, no PPT adversary without K can determine if they originated from the same peer with an advantage better than random guessing:

$$\Pr[A(\tau_{1}, \tau_{2}) = \mathrm{same\_peer}] \leq \frac{1}{2} + \mathrm{negl}(\lambda)$$

Note: This definition applies to the cryptographic protocol transcript. Network-level unlinkability is provided separately via the onion-routed overlay.

Definition 4 (Designated Verifier Zero-Knowledge/Non-Transferability). For any PPT distinguisher $D$ that holds the verifier’s secret K (designated verifier setting), there exists a simulator $S(K)$ that outputs transcripts identically distributed to real protocol executions. This implies that the designated verifier can simulate transcripts perfectly, making proofs non-transferable to third parties.
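The non-transferability argument can be illustrated with a small sketch: a verifier who holds K can produce, entirely on its own, a transcript that passes the same acceptance check as an honest run. HMAC-SHA3-256 again stands in for KMAC256, and the function names and 32-byte nonce size are assumptions made for the example.

```python
import hashlib
import hmac
import secrets

CONTEXT = b"ZKAuth-v1"
NONCE_LEN = 32  # assumed nonce length in bytes


def mac(key: bytes, message: bytes) -> bytes:
    # Stand-in keyed MAC (HMAC-SHA3-256 instead of KMAC256, an assumption).
    return hmac.new(key, CONTEXT + message, hashlib.sha3_256).digest()


def real_transcript(shared_key: bytes) -> tuple:
    # Honest run: one fresh nonce per party, then the candidate's tag Z.
    n_a = secrets.token_bytes(NONCE_LEN)
    n_b = secrets.token_bytes(NONCE_LEN)
    return n_a, n_b, mac(shared_key, n_a + n_b)


def simulated_transcript(verifier_key: bytes) -> tuple:
    # Simulator S(K): the verifier samples both nonces itself and computes
    # the tag with its own copy of K, without any prover taking part.
    n_a = secrets.token_bytes(NONCE_LEN)
    n_b = secrets.token_bytes(NONCE_LEN)
    return n_a, n_b, mac(verifier_key, n_a + n_b)


def accepts(key: bytes, transcript: tuple) -> bool:
    # Acceptance check: recompute the tag and compare in constant time.
    n_a, n_b, z = transcript
    return hmac.compare_digest(mac(key, n_a + n_b), z)
```

Both functions sample nonces from the same distribution and compute the tag the same way, so a third party shown a transcript cannot tell whether a prover was ever involved.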

Definition 5 (Timing-Secure Authentication). The protocol enforces a timing constraint $T_{2}$: no authentication succeeds if the proof arrives more than $T_{2}$ after the challenge is issued, which prevents delayed responses.

Note: Timer $T_{1}$ is an optional throttle for usability, not a security requirement. Replay resistance comes from fresh nonces and MAC verification.
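A minimal sketch of the $T_{2}$ deadline check is shown below. The concrete timeout of 30 s, the `Challenge` record, and the function name are hypothetical; the text does not specify a value for $T_{2}$. Timestamps are passed in explicitly so the check is easy to reason about.

```python
from dataclasses import dataclass

T2_SECONDS = 30.0  # hypothetical value; the text does not fix T2


@dataclass(frozen=True)
class Challenge:
    nonce: bytes
    issued_at: float  # time the challenge was sent, in seconds


def proof_is_timely(challenge: Challenge, arrival_time: float,
                    t2: float = T2_SECONDS) -> bool:
    # Reject proofs that arrive more than T2 after the challenge was issued.
    # Negative elapsed time (clock anomaly) is also rejected.
    elapsed = arrival_time - challenge.issued_at
    return 0 <= elapsed <= t2
```

In a full validator this check would run alongside MAC verification; a timely but invalid proof must still be rejected.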

3.3. Cryptographic Primitives

Keyed MAC. We instantiate message authentication with $\mathrm{KMAC}_{256}$ (NIST SP 800-185) and a fixed context string $\texttt{ZKAuth-v1}$ to provide domain separation:

$$\mathrm{MAC}_{K}(m) := \mathrm{KMAC}_{256}(K, \texttt{ZKAuth-v1} \,\|\, m, 256).$$

We assume standard EUF–CMA and PRF security of $\mathrm{MAC}$.

Hash Function (for token chains only). Where an unkeyed hash is needed (e.g., reverse hash chains for token pools), we use SHA3-256, modeled as a random oracle $H : \{0, 1\}^{*} \to \{0, 1\}^{256}$ for those specific constructions.

Key Derivation (optional). If K originates from a user password, we derive a high-entropy key $K_{auth} \leftarrow \mathrm{Argon2id}(\mathrm{password}, \mathrm{salt}, \mathrm{mem}, \mathrm{iters})$ and use $K_{auth}$ in place of K; we store the derived key $K_{auth}$ for computing $\mathrm{MAC}$ operations and not only its hash image.

Token Pool: A reverse hash chain $T = (\Gamma_{1}, \Gamma_{2}, \dots, \Gamma_{T_{C}})$ where each token is computed as $\Gamma_{i} = H(\mathrm{relayID} \,\|\, \mathrm{swarmID} \,\|\, S_{H} \,\|\, \Gamma_{i-1})$ for secret seed $S_{H}$. This binds redemption to a specific relay and swarm, preventing cross-relay double-spending. Each token $\Gamma_{i}$ represents a value $\Gamma_{V} = \frac{T_{H}}{T_{C}}$.
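The chain construction can be sketched as below. The text does not define the base element of the recursion, so this example assumes it is the empty byte string; the plain byte concatenation and all function names are likewise illustrative assumptions.

```python
import hashlib
from fractions import Fraction


def build_pool(relay_id: bytes, swarm_id: bytes, seed: bytes,
               capacity: int) -> list:
    # Returns [G_1, ..., G_capacity], each the SHA3-256 hash of
    # relayID || swarmID || seed || previous token.
    tokens = []
    prev = b""  # assumed base element; not specified in the text
    for _ in range(capacity):
        prev = hashlib.sha3_256(relay_id + swarm_id + seed + prev).digest()
        tokens.append(prev)
    return tokens


def token_value(total_value: int, capacity: int) -> Fraction:
    # Per-token value: total pool value divided by pool capacity.
    return Fraction(total_value, capacity)
```

Because the relay and swarm identifiers are hashed into every link, a pool built for one relay yields entirely different tokens from a pool built for another, even with the same seed.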

Transmission Token: A sequence of consecutive tokens $(\Gamma_{j}, \Gamma_{j-1}, \dots, \Gamma_{j-k})$ is revealed in descending order for micropayments. Each relay maintains a set $\mathrm{seenTokens}[\mathrm{relay}]$ to enforce one-time use per relay: a token $(\mathrm{relayID}, \mathrm{chainID}, \mathrm{index})$ is accepted at most once per relay.
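The one-time-use rule can be sketched with a small per-relay ledger. The class and method names are hypothetical, and the sketch covers only duplicate detection; checking that a token's hash value actually belongs to a registered pool is a separate step not shown here.

```python
class RelayLedger:
    """Tracks which (relayID, chainID, index) tokens this relay has seen."""

    def __init__(self, relay_id: str) -> None:
        self.relay_id = relay_id
        self.seen_tokens = set()

    def accept(self, chain_id: str, index: int) -> bool:
        # Accept each token identifier at most once; replays are refused.
        key = (self.relay_id, chain_id, index)
        if key in self.seen_tokens:
            return False
        self.seen_tokens.add(key)
        return True
```

A relay would call `accept` before forwarding and drop the message when it returns `False`.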

3.4. Complexity Analysis

Communication Complexity: For a swarm of size n, each new join requires $O(n)$ signaling messages (one SDP exchange per existing peer). With onion routing through h hops, each message incurs $O(h)$ forwarding overhead, for a total of $O(n \cdot h)$ message transmissions per join.
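As a rough worked instance of this bound, using illustrative values that are not taken from the evaluation, consider a newcomer joining a swarm of n = 30 existing peers over circuits of h = 3 hops. Counting one signaling message per existing peer and ignoring constant factors (offer, answer, and individual ICE candidates):

```latex
\underbrace{30}_{n \text{ messages}} \times \underbrace{3}_{h \text{ hops}}
  = 90 \ \text{relay transmissions per join}
```

Doubling either the swarm size or the hop count doubles this figure, which is the linear behavior the $O(n \cdot h)$ bound expresses.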

Computational Complexity: Zero-knowledge proof computation requires three SHA3-256 operations (one for password image, one for proof generation, one for verification). Each operation’s complexity is proportional to the size of its input, $O(|K| + |N_{A}| + |N_{B}|)$. For a 256-bit key and nonces, this is constant time. For an n-peer swarm, the total authentication cost is $O(n)$ hash operations.

Cryptographic Security: Under the random oracle model, breaking soundness via a preimage attack requires an expected $2^{256}$ hash queries (brute force), providing $\lambda = 256$-bit security.

4. System Architecture

The WebRTC Swarms architecture consists of two primary participant types: lightweight peers (end-user clients, typically browsers) and full nodes (GRIDNET Core instances that form the signaling overlay). Peers do not connect directly to a central server; instead, they connect to one or more full nodes via WebSocket to join the decentralized network.

4.1. Network Topology

Full nodes collectively maintain the state of active swarms (analogous to chat rooms or conference sessions, each identified by a unique hash) and route signaling messages among themselves and to local peers. The full node network forms a peer-to-peer overlay that implements onion routing: messages are wrapped in multiple layers of encryption and forwarded through multiple hops, ensuring that no single intermediate node learns both the source and destination of a communication.

When a peer wishes to create or join a swarm, it provides a swarm identifier to its connected full node (see Figure 1). The node creates or updates a CWebRTCSwarm instance to track this swarm’s members, each identified by an ephemeral agent ID generated per swarm session. This ephemeral identity ensures peers cannot be tracked across different swarms or sessions.

Figure 1. Simplified topology showing peers connecting through full nodes that form an onion-routed overlay for signaling. Multi-hop circuits (e.g., X → Z → Y) prevent any single node from linking sender to receiver.

4.2. Swarm Creation and Routing

When a peer initiates a join request, the request propagates through the overlay in a broadcast fashion to ensure all existing swarm members become aware of the newcomer. Each full node that handles the message adds a layer of encryption and increments a hop counter before forwarding to prevent infinite loops (with a default maximum of 100 hops). The original sender’s identity is not revealed beyond the first hop—full nodes forward messages with source identifiers left empty or replaced with their own transient identifiers.
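The forwarding rule described above can be sketched as follows. The sketch models only the hop limit and source blanking; the added encryption layer is omitted. The `JoinMsg` record and `forward` function are hypothetical names, not identifiers from GRIDNET Core.

```python
from dataclasses import dataclass, replace
from typing import Optional

MAX_HOPS = 100  # default maximum stated in the text


@dataclass(frozen=True)
class JoinMsg:
    swarm_id: str
    source_id: str
    hops: int = 0


def forward(msg: JoinMsg) -> Optional[JoinMsg]:
    # Drop the message once the hop limit is reached, bounding propagation.
    if msg.hops >= MAX_HOPS:
        return None
    # Blank the source identifier and count the hop before passing it on.
    return replace(msg, source_id="", hops=msg.hops + 1)
```

A node could equally substitute its own transient identifier for the empty string, the second option the text mentions.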

Each signaling message is encapsulated in a CNetMsg with type metadata and routing information. Full nodes use a distributed routing table (CDataRouter) to decide forwarding paths. For example, when a swarm is created or a member joins, nodes update their routing tables such that messages addressed to that swarm ID or specific peer ID are routed toward the appropriate conversation endpoint, either locally or via neighbor nodes.

4.3. Micropayment-Incentivized Relaying

To discourage malicious or lazy relays and handle open participation, WebRTC Swarms integrate token pools and transmission tokens—cryptographic payment structures as defined in SPIDE (Sybil-Proof Incentivized Data Exchange) [4].

A token pool ($T$) is a reverse hash chain construction registered on the blockchain, where each identity possesses a pool containing a sequence of hash values $\Gamma_{1}, \Gamma_{2}, \dots, \Gamma_{T_{C}}$, where $T_{C}$ is the pool’s capacity. Each hash is computed as $\Gamma_{i} = H(S_{H} \,\|\, \Gamma_{i-1})$ from a secret seed $S_{H}$, where $H$ denotes SHA3-256. The pool represents a store of cryptocurrency value, with each unitary hash ($\Gamma$) corresponding to a fractional amount of the total pool value ($\Gamma_{V} = \frac{T_{H}}{T_{C}}$, where $T_{H}$ is the total pool value).

Before sending a signaling message, the peer’s full node retrieves one or more transmission tokens from the token pool. A transmission token ($\Gamma$) is a sequence of consecutive hash values from the token pool, revealed in descending order (e.g., $\Gamma_{100}, \Gamma_{99}, \Gamma_{98}, \dots$) [4]. When a peer reveals $\Gamma_{i}$ from their pool, any intermediate relay can later cash out the payment by registering these revealed tokens on-chain, proving possession of valid hash values from the authenticated pool. Because hash chains are one-way functions, revealing $\Gamma_{i}$ does not compromise future tokens $\Gamma_{j}$ where $j > i$, but does prove expenditure of pool value. If the pool is depleted ($T_{U} = T_{C}$, where $T_{U}$ is the index of the last used token), the message cannot be sent, preventing Sybil attackers from spawning unlimited free requests [4].
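Spending from a pool and the depletion condition can be sketched as below. For simplicity the sketch tracks usage as a count of revealed tokens and treats the pool as depleted when that count equals the capacity; this bookkeeping, the class name, and the error type are assumptions made for illustration.

```python
class TokenPool:
    def __init__(self, tokens: list) -> None:
        self.tokens = list(tokens)  # tokens[i - 1] holds the i-th chain value
        self.used = 0               # number of tokens revealed so far

    @property
    def capacity(self) -> int:
        return len(self.tokens)

    def spend(self, count: int) -> list:
        # Refuse to send when the pool cannot cover the payment.
        if count < 1 or self.used + count > self.capacity:
            raise ValueError("token pool depleted or invalid count")
        # Reveal consecutive tokens in descending index order.
        start = self.capacity - self.used
        revealed = [self.tokens[i - 1] for i in range(start, start - count, -1)]
        self.used += count
        return revealed
```

Once `spend` raises, the sender has no remaining value in this pool and must register a new one before sending further messages.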

Relays accumulate transmission tokens from multiple forwarding operations and are incentivized to batch-register them on-chain to amortize transaction costs. Verification requires checking that revealed hash values belong to a legitimate, registered token pool and that they have not been previously spent (tracked via $T_{U}$ in the blockchain state). Token Validation: Relays perform optimistic local checks against the known token pool root and periodically settle/verify on-chain to detect double-spending. This mechanism serves dual purposes: authentication (possession of valid tokens from a registered pool implies stake) and incentivization (relays earn micropayments for forwarding). Token costs create financial deterrents against denial-of-service attacks while compensating honest nodes [4].

4.4. Media Path and NAT Traversal

Importantly, WebRTC Swarms relays are primarily used for signaling and coordination, not necessarily for relaying media/data streams. After signaling completes, peers attempt direct peer-to-peer connections via ICE (using STUN [29]/TURN [30] as usual). The full node network also acts as distributed STUN/TURN servers when direct connectivity fails: if peers cannot establish direct ICE paths, they can route encrypted WebRTC packets through the full-node network as a relay of last resort.

This design provides decentralized NAT traversal assistance by allowing the network of incentivized full nodes to act as a distributed set of TURN relays. Rather than relying on a centralized TURN server, any full node on a signaling route can forward encrypted media packets as a last resort if a direct peer-to-peer connection fails. Our experiments show this approach achieves near-100% connectivity compared to ∼85% success rates for STUN-only deployments, significantly improving reliability even under restrictive NAT configurations.

It is important to note the tradeoff between performance and anonymity. If peers successfully establish a direct P2P connection via ICE, they inevitably expose their IP addresses to one another. For scenarios requiring strict anonymity between participants (not just from outsiders), peers must force the usage of the full-node relay network, thereby masking their IP addresses at the cost of higher latency.

5. Threat Model

We assume an adversarial environment with multiple threat categories.

5.1. Colluding Relays

Onion-routed signaling ensures no single relay knows the complete message path. An entry (guard) relay sees the true origin but not the final destination; an exit relay sees the destination but not the origin. An adversary controlling both the entry (guard) and exit relay on a circuit can link the sender to the receiver through timing analysis. While the protocol functions with a single relay, a minimum of three independent full nodes (entry, middle, exit) is recommended to ensure effective onion-routing anonymity properties against colluding adversaries. The resistance to this attack is therefore probabilistic and depends on the adversary controlling a sufficiently small fraction of the total full-node population, making it unlikely for a user to randomly select a compromised circuit. We mitigate this through network size and (in future work) random multi-hop circuit selection. Identity-hiding measures (blank source IDs for forwarded joins) ensure remote nodes see newcomers as abstract identities unlinkable to global personas.
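To make the probabilistic argument concrete, assume (as a simplification not stated in the text) that the entry and exit relays of a circuit are drawn independently and uniformly from a large population in which the adversary controls a fraction $f$ of full nodes. The chance that a given circuit has both ends compromised is then approximately:

```latex
\Pr[\text{entry and exit both compromised}] \approx f \cdot f = f^{2},
\qquad f = 0.1 \;\Rightarrow\; f^{2} = 0.01,
\quad f = 0.2 \;\Rightarrow\; f^{2} = 0.04
```

Under this simplified model, an adversary controlling 10% of full nodes would be positioned to link about 1% of randomly selected circuits, which is why the defense weakens quickly as the compromised fraction grows.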

Relays cannot tamper undetectably with messages because all messages are signed or integrity-protected. The use of cryptographic signatures means any alteration invalidates the signature, allowing detection. Thus, confidentiality and integrity of signaling are assured against minority-malicious nodes.

5.2. Global Traffic Analysis

Similar to Tor, WebRTC Swarms are not immune to a global adversary monitoring all network traffic. End-to-end timing correlation attacks could potentially link entry and exit traffic if it is observable with fine granularity. We assume in our threat model that the adversary cannot monitor every network link simultaneously. Importantly, even if timing correlation hints at session correlation, the adversary does not learn message content (encrypted end-to-end) nor actual user identities (only ephemeral IDs and token signatures).

5.3. Sybil Attacks and Spam

Without precautions, attackers could spawn countless identities to snoop or flood the network. We counter this using the following:

Economic costs: Each message requires a token; spamming signals requires proportional token expenditure. Since tokens have real scarcity (blockchain-based or stake-backed), Sybil attackers must invest linearly increasing resources.
Cryptographic authentication: Peers must present valid credentials (zero-knowledge proofs of group membership) to actually join swarms. Without credentials, Sybil identities cannot pass Stage 2 authorization.
Rate limiting: Full nodes throttle join requests per peer (e.g., maximum one join per 5 s), preventing rapid rejoin cycling.

Bootstrapping (Cold Start): We assume that an out-of-band bootstrapping mechanism (e.g., a faucet or invite code) provides new users with an initial micro-balance of tokens to perform their first handshake. This is a standard assumption in token-based systems and prevents legitimate new users from being locked out while maintaining Sybil resistance through subsequent token costs.

5.4. Malicious or Unauthorized Peers

The primary goal is to keep unauthorized users out of actual communication. Our two-stage join process handles this: Stage 1 (signaling stage) is open—join messages are forwarded to all members (avoiding false negatives). However, unauthorized peers will not pass Stage 2 (authorization stage), where existing members demand valid zero-knowledge proofs of group membership. Peers failing authentication are kept in “limbo”—connections technically established but only dummy data are shared until proof succeeds or timeout expires.

5.5. Pre-Shared Key Protection and Zero-Knowledge Authentication

A critical threat addressed by our system is the protection of pre-shared keys (PSK) used for private group authentication. We assume adversaries with the following capabilities:

Network Eavesdropping: Passive adversaries monitoring all network traffic between peers, including WebSocket connections to full nodes and WebRTC data channels.
Traffic Analysis: Active adversaries attempting to infer the PSK through statistical analysis of message patterns, timing, sizes, or frequency.
Replay Attacks: Adversaries capturing and replaying authentication messages to gain unauthorized access.
Man-in-the-Middle: Active attackers intercepting and potentially modifying messages between peers.
Compromised Relays: Malicious full nodes attempting to extract PSK information from signaling messages.

Our designated verifier zero-knowledge authentication protocol (Section 7) ensures that:

1.: The PSK is never transmitted over the network—not even in encrypted form between legitimate peers.
2.: If K originates from a user password, a high-entropy derived key $K_{auth}$ (via Argon2id) is stored for MAC operations, not the original password.
3.: Authentication exchanges involve ephemeral nonces ($N_{A}$, $N_{B}$) and MAC-based authentication tags ($Z = \mathrm{MAC}_{K}(\texttt{ZKAuth-v1} \| N_{A} \| N_{B})$), ensuring no two authentication sessions produce identical traffic patterns.
4.: The timing constraint $T_{2}$ enforces freshness for tag delivery, preventing delayed responses. Fresh nonces and MAC verification provide replay resistance (Timer $T_{1}$ is an optional throttle, not security-critical).
5.: Even if an adversary compromises a full node or intercepts all traffic, they learn only random-appearing MAC tags that provide zero information about K to outsiders without the key (under PRF security of the MAC).
6.: Two entities possessing the same K can independently compute matching MAC tags and thereby prove mutual possession without revealing the key itself.

Designated Verifier Property: All indistinguishability and unlinkability claims apply to observers without K. A designated verifier holding K can simulate transcripts, hence proofs are non-transferable to third parties.

This threat model explicitly accounts for scenarios where adversaries attempt to extract, intercept, or infer the PSK through any means short of compromising the end-user devices where the PSK is initially entered.

5.6. Side-Channel Attacks

While the zero-knowledge authentication protocol provides strong cryptographic guarantees, side-channel attacks remain a concern in practical deployments:

5.6.1. Timing Side-Channels

Protocol-Level Timing: The ZK authentication protocol uses fixed, public timers (Timer1 = 10 s as an optional throttle, Timer2 = 3 s as the freshness window). These timers could potentially leak information about protocol phases. However, their coarse granularity (seconds rather than milliseconds) and their fixed, public nature mean they reveal only that the protocol is executing correctly, not information about the secret K.

Computational Timing: Hash computation time for SHA3-256 is data-independent and constant-time in standard implementations, preventing timing attacks on the proof generation step $Z = \text{SHA3-256}(K \| N_{A} \| N_{B})$. However, implementations must ensure that comparison operations (when the validator checks $Z = Z'$) use constant-time comparison functions to avoid timing leaks. Variable-time comparisons could leak partial information about the hash output through timing measurements.
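The comparison requirement can be illustrated with a minimal Python sketch. It assumes the tag is a 32-byte value and uses the standard-library `hmac.compare_digest`; the helper name `tags_match` and the example key and nonce values are hypothetical and not part of the GRIDNET implementation.

```python
import hashlib
import hmac


def tags_match(expected: bytes, received: bytes) -> bool:
    """Compare two tags without exiting early at the first differing byte."""
    # hmac.compare_digest is the standard-library constant-time comparison.
    return hmac.compare_digest(expected, received)


# Hypothetical example values; a real session uses a secret K and fresh nonces.
K = bytes(32)
N_A = bytes([1]) * 32
N_B = bytes([2]) * 32
Z = hashlib.sha3_256(K + N_A + N_B).digest()  # tag as written in this subsection
Z_forged = bytes(32)                          # an attacker's guess
```

A plain `==` on byte strings may stop at the first mismatching byte, which is the variable-time behavior the paragraph warns against.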

Network Timing: Round-trip times (RTT) between peers can leak geographic proximity or network topology; this leakage is inherent to any real-time communication system. Our onion-routed signaling adds variable latency through multi-hop paths, providing some obfuscation of direct peer-to-peer RTT. However, statistical analysis of timing patterns across multiple sessions could potentially reveal persistent peer relationships.

5.6.2. Metadata Leakage

Message Size: All protocol messages have relatively uniform, small sizes. Join messages, offers, answers, and authentication proofs are padded to standard sizes where possible. SHA3-256 outputs are fixed at 256 bits (32 bytes). However, SDP offers and ICE candidates vary in size depending on network conditions (number of candidates, interface types). This variation could leak information about peer network configuration but does not reveal secret credentials.
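One way to pad small protocol messages to a uniform size is sketched below. The frame size of 256 bytes, the 2-byte length prefix, and the function names are assumptions made for illustration; the paper does not specify the padding format.

```python
import struct

PADDED_SIZE = 256  # hypothetical fixed frame size in bytes


def pad_message(payload: bytes, size: int = PADDED_SIZE) -> bytes:
    """Prefix the payload with its 2-byte length and zero-fill to `size` bytes."""
    if len(payload) > size - 2:
        raise ValueError("payload too large for the fixed frame")
    filler = bytes(size - 2 - len(payload))
    return struct.pack(">H", len(payload)) + payload + filler


def unpad_message(frame: bytes) -> bytes:
    """Recover the original payload from a padded frame."""
    (length,) = struct.unpack(">H", frame[:2])
    return frame[2:2 + length]
```

Padding of this kind only hides sizes if it is applied before encryption, so that an observer sees equal-length ciphertexts.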

Message Frequency and Patterns: The protocol exhibits characteristic patterns: join broadcast → multiple offer/answer pairs → ICE exchanges → authentication. Observers monitoring message frequencies could distinguish protocol phases. We do not currently implement traffic shaping or dummy messages to obfuscate these patterns, as the performance cost would be significant for real-time communication. This is a deliberate tradeoff: pattern visibility does not compromise credential secrecy or authentication soundness.

Swarm Membership Inference: An adversary controlling multiple full nodes could correlate join messages across nodes to infer swarm membership size and activity patterns. Defense: peers can connect through different full nodes for different swarms, and join messages use ephemeral session identifiers that change per attempt.

5.6.3. Traffic Analysis

Entry–Exit Correlation: As discussed in Section 5.1, an adversary controlling both entry and exit relays can perform timing correlation attacks. While we cannot eliminate this risk without unbounded latency, the probabilistic defense (random relay selection, network size) makes successful attacks require controlling a significant fraction of full nodes—costly under our cryptoeconomic model.
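The probabilistic defense can be made concrete with a small calculation. The sketch below assumes that entry and exit relays are drawn uniformly at random without replacement from N full nodes, of which c are adversarial, giving a probability of c(c−1)/(N(N−1)); uniform selection is an assumption of this example, and any stake- or reputation-weighted selection would change the numbers.

```python
from fractions import Fraction


def both_ends_compromised(total_nodes: int, bad_nodes: int) -> Fraction:
    """Probability that entry and exit are both adversarial when the two
    relays are drawn uniformly at random without replacement."""
    if total_nodes < 2 or not 0 <= bad_nodes <= total_nodes:
        raise ValueError("need total_nodes >= 2 and 0 <= bad_nodes <= total_nodes")
    first = Fraction(bad_nodes, total_nodes)
    second = Fraction(bad_nodes - 1, total_nodes - 1)
    return first * second
```

For instance, an adversary holding 100 of 1000 nodes compromises both ends of a given circuit with probability 11/1110, just under 1%.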

Long-Term Statistical Attacks: Adversaries collecting timing data across many sessions could build statistical models linking peers. Countermeasures include: (1) varying relay path selection per session, (2) introducing random delays at relay hops (not currently implemented, as it degrades user experience), (3) cryptoeconomic costs making mass surveillance prohibitively expensive.

5.6.4. Mitigation Summary

Our threat model acknowledges these side-channel risks but prioritizes practical deployability:

Constant-time implementations for cryptographic primitives (SHA3-256, comparisons);
Coarse timing granularities (seconds not milliseconds) for protocol timers;
Onion routing with multi-hop paths adds timing noise;
Cryptoeconomic costs make mass traffic analysis expensive;
Message padding where practical (authentication proofs, nonces);
Ephemeral identifiers prevent long-term tracking across sessions;
Scope limitation: the model addresses technical network threats. Non-technical attacks, such as social engineering or phishing to obtain the PSK from a user, are outside the scope of this protocol’s cryptographic guarantees.

We explicitly do not claim resistance to sophisticated traffic analysis by well-funded global adversaries (nation-state level). For such threat models, additional countermeasures (cover traffic, constant-rate shaping, mix networks) would be necessary but come with substantial performance costs incompatible with real-time WebRTC communication. The system’s security model targets protection against eavesdropping, unauthorized access, and moderate-scale surveillance, while accepting that determined attackers with visibility into multiple network vantage points may infer some metadata about communication patterns.

5.7. Censorship and Reliability

The swarm overlay’s decentralized architecture provides inherent censorship resistance through several mechanisms:

Universal Bridge Capability: Unlike systems with dedicated bridge nodes, every GRIDNET Core node functions as a potential bridge. Nodes discover each other through a variant of the Kademlia Distributed Hash Table (DHT) protocol, ensuring that no single node maintains complete knowledge of the network topology. The topology changes dynamically as operators join and leave the network, making it difficult for adversaries to enumerate or block all entry points.

Mobile Participation: The network includes mobile applications (iOS/Android) that actively participate in signaling dissemination. These mobile nodes typically use dynamic IP addresses assigned by cellular carriers, further complicating attempts to create comprehensive blocklists.

Asynchronous, Event-Driven Architecture: When the in-web-browser subsystem of GRIDNET OS is launched, it receives a list of randomly selected GRIDNET Core nodes discovered by peers serving the Decentralized User Interface (DUI) session. These node IP addresses are delivered through end-to-end encrypted WebSocket connections using BER encoding. The Swarms API-compatible UI dApps then establish connections to these nodes. All communication between UI dApps, the in-browser GRIDNET OS subsystem, and signaling nodes is asynchronous and event-driven, allowing peers to opportunistically connect through any available path.

Traffic Indistinguishability: Swarm messages are encrypted and designed to be indistinguishable from standard encrypted web traffic to external observers, though we acknowledge that sophisticated statistical analysis of packet timing and size remains a theoretical vector for traffic classification. Censors cannot selectively target specific swarms without blocking all encrypted traffic through the network.

Users can operate their own full nodes or connect through alternative routes if specific nodes are blocked. The combination of Kademlia-based discovery, mobile participation, and universal bridge capability ensures that blocking access requires censoring a large, constantly changing set of IP addresses across multiple network types (residential, datacenter, mobile).

6. Signaling Protocol Design

The WebRTC Swarm signaling protocol extends traditional WebRTC offer/answer exchanges into a multi-party, multi-hop context. We formalize the protocol phases and define state machines for full nodes and peers.

6.1. Core Message Types

Signaling messages are encapsulated in CSDPEntity structures with types:

joining: Peer’s request to join a swarm (broadcast to all participants);
getOffer: Request a specific peer to generate an SDP Offer;
processOffer: SDP Offer from one peer to another;
processOfferResponse: SDP Answer in response to Offer;
processICE: ICE candidate message carrying network candidate;
control: Control/status messages (confirmations, errors, membership updates).

These messages are encrypted and routed by full nodes, which treat most as opaque payloads except for certain control messages that they handle internally.
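As an illustration, the message types can be modeled as follows. The type strings come from the list above; the field names (`swarm_id`, `source_id`, `destination_id`, `payload`) and the helper `is_broadcast` are hypothetical, since the actual layout of CSDPEntity is not specified here.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class SDPEntityType(Enum):
    JOINING = "joining"
    GET_OFFER = "getOffer"
    PROCESS_OFFER = "processOffer"
    PROCESS_OFFER_RESPONSE = "processOfferResponse"
    PROCESS_ICE = "processICE"
    CONTROL = "control"


@dataclass(frozen=True)
class SDPEntity:
    type: SDPEntityType
    swarm_id: str
    source_id: str
    destination_id: Optional[str] = None  # None when the message is broadcast
    payload: bytes = b""                  # encrypted, opaque to full nodes


def is_broadcast(entity: SDPEntity) -> bool:
    """Only join requests are broadcast to all participants."""
    return entity.type is SDPEntityType.JOINING
```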

6.2. Join Phase Protocol

In Algorithm 1, when peer A (new joiner) starts in state NotJoined, it sends a joining message to its full node X. Node X performs validation (validate() checks well-formedness), ensures swarm is operational, then:

1.: Adds A to member list with unauthenticated status;
2.: Checks rate limit (prevents flooding);
3.: Broadcasts getOffer requests to all other local members;
4.: Sends confirmation control message back to A;
5.: Forwards joining to other full nodes (flooding with hop-count increment).

Algorithm 1 Join phase (broadcast)

1:: Input: New peer A wants to join swarm S
2:: A sends joining message to connected full node X
3:: X validates message and checks if A already authenticated
4:: if A not present or unauthenticated then
5:: Check rate limit: if $(\mathrm{NOW} - \mathrm{lastJoin}[A]) < 5\,\mathrm{s}$ then abort
6:: lastJoin[A] ← NOW
7:: Attach dedup fields: generate fresh $msgID$ and set $ttl \leftarrow TTL\_MAX$
8:: Broadcast getOffer $(msgID, ttl)$ to all local members of S
9:: Send control:joined confirmation back to A
10:: Forward joining $(msgID, ttl)$ to all neighbor nodes in overlay
11:: end if
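A minimal sketch of how a full node might process a join along the lines of Algorithm 1 is given below. The class name `FullNodeSketch`, the `outbox` list used in place of real network sends, the value of `TTL_MAX`, and the injectable clock are all assumptions of this example.

```python
import secrets
import time

TTL_MAX = 8            # hypothetical hop limit
JOIN_INTERVAL_S = 5.0  # one join per peer per 5 s, as in Algorithm 1


class FullNodeSketch:
    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.members = {}    # peer_id -> "unauthenticated" | "authenticated"
        self.last_join = {}  # peer_id -> time of the last accepted join
        self.outbox = []     # (kind, target, msg_id, ttl), stands in for sends

    def on_joining(self, peer_id: str) -> bool:
        """Return True if the join was accepted and propagated."""
        if self.members.get(peer_id) == "authenticated":
            return False  # duplicate or spoofed join for an existing member
        now = self.clock()
        last = self.last_join.get(peer_id)
        if last is not None and now - last < JOIN_INTERVAL_S:
            return False  # rate limited
        self.last_join[peer_id] = now
        others = [m for m in self.members if m != peer_id]
        self.members[peer_id] = "unauthenticated"
        msg_id = secrets.token_hex(16)  # fresh dedup identifier
        for member in others:
            self.outbox.append(("getOffer", member, msg_id, TTL_MAX))
        self.outbox.append(("control:joined", peer_id, msg_id, TTL_MAX))
        self.outbox.append(("joining", "neighbors", msg_id, TTL_MAX))
        return True
```

The sketch follows the ordering of Algorithm 1, where the rate-limit check precedes any broadcast.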

Every full node receiving the join delivers it to local peers in that swarm and forwards further if applicable, ensuring network-wide awareness.

Scalability Note: The current implementation uses managed flooding, which is suitable for small-to-medium overlays and ensures maximum reachability with $O(N)$ message complexity, where N is the number of full nodes. Future iterations may optimize this with DHT-based routing for global scale.

6.3. Offer Phase Protocol

After broadcasting the join, the protocol enters pairwise SDP exchanges (Algorithm 2). For each existing participant B, their full node Y delivers a getOffer request, prompting B to

1.: Transition to PreparingOffer state;
2.: Generate WebRTC offer via RTCPeerConnection.createOffer();
3.: Wrap offer as processOffer SDP entity;
4.: Route through overlay to A’s node X.

Node X delivers the offer to A, which sets it as a remote description and immediately generates an answer. This answer is routed back to B via Y. After completing offer/answer, both peers enter the ICE gathering state.

This procedure repeats concurrently between newcomer A and every existing peer. Importantly, A always acts as answerer (never offerer) to avoid glare problems and simplify role assignment.

Algorithm 2 Offer/answer exchange

1:: Input: Peer B receives getOffer for new peer A
2:: B creates RTCPeerConnection for A (if not exists)
3:: B generates SDP offer
4:: B sends processOffer(A, offer) to swarm
5:: When A receives offer:
6:: A creates pending connection with B
7:: A sets remote description = offer
8:: A generates SDP answer
9:: A sends processOfferResponse(B, answer) to swarm
10:: When B receives answer:
11:: B sets remote description = answer
12:: B transitions to iceExchange state

6.4. ICE Candidate Exchange

Once offers and answers are exchanged, peers gather ICE candidates and exchange them over the swarm overlay. ICE messages are always unicast to the specific target peer. Full nodes route ICE messages in the same way as offers: if the target peer is local, the node delivers directly; otherwise it forwards to the appropriate neighbor. The code attempts local delivery first and routes remotely only if the peer is not found locally.

6.5. State Machines

6.5.1. Full Node State Machine

For each swarm, a full node maintains

Members: Set of members with authentication status;
PendingSDP: Queue of incoming messages to route;
Routing table entries for swarm/peer IDs.

State transitions occur on message receipt:

On joining: If member already authenticated, ignore the message (duplicate or spoofed join). If not present or unauthenticated: add/update member with status=unauthenticated, check rate limit, broadcast getOffer to local members, send control:joined confirmation, forward to neighbors.
On processOffer: Attempt local delivery to destination peer. If not local, forward via routing table.
On processOfferResponse: Similar to offer—deliver locally if possible, else forward.
On processICE: Find destination peer. If local, deliver directly. If not, forward to remote node. A comment in the implementation flags this path for improvement: forwarding should occur only if the local connection was authenticated, since otherwise the node is open to spoofing. This indicates an opportunity for security enhancement.
On control: Log and update member states as needed (e.g., if control says that the peer is banned, mark or remove them).

6.5.2. Peer State Machine

Peer A (joiner):

Idle → Joining: Calls joinSwarm(), sends joining;
Joining → JoinConfirmed: Receives control:joined, marks joinConfirmed=true;
JoinConfirmed → Connecting: Receives offers from existing peers, creates pending connections, sends answers;
Connecting → Connected: After ICE completes, establishes WebRTC connections.

Peer B (existing member):

Stable → OfferSending: Receives getOffer, creates connection for new peer, sends offer;
OfferSending → WaitingAnswer: Offer sent, awaits answer;
WaitingAnswer → ICEExchange: Answer received, sets remote SDP;
ICEExchange → Connected: ICE succeeds, connection established.
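The joiner side of the state machine can be written as a transition table, as in the sketch below. The state names follow the list above; the event label `iceComplete` and the function `step` are hypothetical. The existing-member machine (Stable through Connected) would follow the same pattern.

```python
from enum import Enum, auto


class JoinerState(Enum):
    IDLE = auto()
    JOINING = auto()
    JOIN_CONFIRMED = auto()
    CONNECTING = auto()
    CONNECTED = auto()


# (current state, event) -> next state
JOINER_TRANSITIONS = {
    (JoinerState.IDLE, "joinSwarm"): JoinerState.JOINING,
    (JoinerState.JOINING, "control:joined"): JoinerState.JOIN_CONFIRMED,
    (JoinerState.JOIN_CONFIRMED, "processOffer"): JoinerState.CONNECTING,
    (JoinerState.CONNECTING, "iceComplete"): JoinerState.CONNECTED,
}


def step(state: JoinerState, event: str) -> JoinerState:
    """Return the next state, or raise if the event is invalid in this state."""
    try:
        return JOINER_TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event {event!r} not allowed in state {state.name}") from None
```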

6.6. Ephemeral Session Identifiers

Each offer/answer negotiation has a unique SDP session ID (random 64-bit) tracked per connection. Messages with mismatched session IDs are rejected, preventing replay attacks. Similarly, each join request carries a sequence number (seqNr) to distinguish attempts by the same peer. These ephemeral identifiers ensure stale messages from prior sessions cannot disrupt current sessions.
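A minimal sketch of the session ID check follows, assuming one active negotiation per remote peer. The class `SessionTracker` and its method names are hypothetical; only the random 64-bit identifier and the reject-on-mismatch rule come from the text.

```python
import secrets


class SessionTracker:
    def __init__(self):
        self.sessions = {}  # peer_id -> current 64-bit session ID

    def open(self, peer_id: str) -> int:
        """Start a new negotiation with a fresh random 64-bit session ID."""
        session_id = secrets.randbits(64)
        self.sessions[peer_id] = session_id  # replaces any earlier session
        return session_id

    def accepts(self, peer_id: str, session_id: int) -> bool:
        """Accept a message only if it carries the current session ID."""
        return self.sessions.get(peer_id) == session_id
```

Because a new call to `open` overwrites the stored ID, messages captured from an earlier negotiation with the same peer no longer match and are dropped.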

6.7. Timing and Ordering Guarantees

The protocol does not require strict global ordering (mostly asynchronous), but within a given pair negotiation, certain orderings are enforced: an answer is only accepted if a matching offer was seen first. ICE candidates may arrive before offer/answer (trickle ICE) but are cached until appropriate stage.
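The ordering rules can be sketched from the offerer's point of view as below. The class and attribute names are hypothetical, and strings stand in for real SDP and ICE objects.

```python
class OffererNegotiation:
    """Per-pair ordering rules as seen by the peer that sent the offer."""

    def __init__(self):
        self.offer_sent = False
        self.remote_set = False
        self.cached_ice = []   # candidates that arrived too early
        self.applied_ice = []  # candidates handed to the connection

    def send_offer(self) -> None:
        self.offer_sent = True

    def on_answer(self, answer: str) -> bool:
        """Accept an answer only if a matching offer was sent first."""
        if not self.offer_sent:
            return False
        self.remote_set = True
        # Flush candidates that were cached while waiting for the answer.
        self.applied_ice.extend(self.cached_ice)
        self.cached_ice.clear()
        return True

    def on_ice(self, candidate: str) -> None:
        """Apply a candidate now, or cache it until the answer arrives."""
        if self.remote_set:
            self.applied_ice.append(candidate)
        else:
            self.cached_ice.append(candidate)
```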

Use of hop-by-hop seen message caches (getWasDataSeen) ensures duplicates (if overlay floods cause multiple copies via different routes) are ignored, providing at-least-once delivery with deduplication.
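The effect of a seen-message cache on flooding can be simulated with the sketch below. It models the overlay as an adjacency dictionary and tracks, for a single message, which nodes have already processed it; the function `flood` and the graph encoding are assumptions of this example and do not reproduce the internals of getWasDataSeen.

```python
from collections import deque


def flood(adjacency: dict, origin, ttl: int):
    """Flood one message from `origin`; return (nodes reached, copies sent)."""
    seen = {origin}                # nodes that have already processed the message
    queue = deque([(origin, ttl)])
    transmissions = 0
    while queue:
        node, remaining = queue.popleft()
        if remaining == 0:
            continue               # hop budget exhausted, do not forward
        for neighbor in adjacency[node]:
            transmissions += 1
            if neighbor in seen:
                continue           # duplicate copy is ignored by the receiver
            seen.add(neighbor)
            queue.append((neighbor, remaining - 1))
    return seen, transmissions
```

On a four-node ring with a hop budget of 3, every node is reached once while the redundant copies arriving over the second route are discarded.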

Invariant 1: For any two honest peers A and B, at most one WebRTC negotiation is active between them at any time (enforced by locking and timestamps).

Invariant 2: A peer has exactly 0 or 1 identity in a swarm. The group authorization design (next section) ensures peers cannot present multiple identities simultaneously.

6.8. Stale Connection Detection

If a connection is not fully connected within ∼45 s of signaling, it is closed and removed. This threshold balances ICE path discovery time with resource cleanup needs, preventing half-open handshakes from lingering indefinitely.
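A periodic sweep of this kind might look like the following sketch. The dictionary layout (`state`, `started_at`) and the function name are hypothetical; only the roughly 45 s threshold is taken from the text.

```python
STALE_AFTER_S = 45.0  # threshold from the text


def sweep_stale(connections: dict, now: float, limit: float = STALE_AFTER_S) -> list:
    """Remove connections that are still not connected after `limit` seconds.

    `connections` maps peer_id -> {"state": str, "started_at": float}.
    Returns the list of removed peer IDs.
    """
    stale = [
        peer_id
        for peer_id, conn in connections.items()
        if conn["state"] != "connected" and now - conn["started_at"] > limit
    ]
    for peer_id in stale:
        del connections[peer_id]
    return stale
```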

7. Designated Verifier Zero-Knowledge Authentication via a PSK-Keyed MAC

Once the signaling handshake connects peers, WebRTC Swarms enforce an authorization step before permitting actual payload communication. The goal is to ensure that only peers possessing valid group credentials can participate meaningfully, without revealing identities or linking participation across sessions. The following protocol is a designated verifier zero-knowledge authentication via PSK-keyed MAC. It is a three-round protocol where the validator and candidate exchange nonces and MAC-based authentication tags. It satisfies the definitions of EUF-CMA soundness, outsider indistinguishability, outsider unlinkability, and designated verifier zero knowledge (non-transferability), ensuring that only participants who know the secret K can gain access, without revealing K itself, while maintaining unlinkability to outsiders across sessions.

7.1. Credential Issuance and Lifecycle

For closed/private swarms, a trusted Group Manager (or decentralized CA via blockchain governance) issues credentials to authorized users. This credential could be a digital group signature key, anonymous credential (e.g., Camenisch–Lysyanskaya style), or knowledge of a pre-shared secret.

In our implementation’s current form, the credential is a 256-bit pre-shared secret K. All authorized members know K in advance (distributed out-of-band or via secure invite mechanism). Peers do not transmit K directly; instead, they prove knowledge by deriving values from it.

7.2. Bootstrap (Join) Phase with Credentials

When a user attempts to join a closed swarm, they must provide credential data to the full node with a join request. In GRIDNET’s joinSwarm API, the user passes userID and privKey. The combination of a transmission token and a potential signature provides cryptographic assurance of credential possession.

For maximum privacy, we aim for zero-knowledge proof: the joiner presents proof of credential to full node (or other peers) at join time. Our implementation uses a password image scheme: each peer setting the swarm’s secret key computes hash mPasswordImage = sha3_256(pass) and stores it. When peers connect, they compare these password images or perform a mutual check.

7.3. Protocol State Machine

We describe the authorization handshake between peers A and B after the WebRTC data channel is established. Initially, the channel carries only dummy data. Each peer stores a passwordImage (secret hash).

For mutual authentication, each peer maintains two instances of this state machine, one as a validator and one as a candidate (Algorithm 3). Connection states are

Unauthenticated: Connection open but not verified;
AuthChallengeSent: Sent challenge and awaiting response;
AuthChallengeReceived: Received partner’s challenge and needs to respond;
Authenticated: Both sides are verified.

Algorithm 3 Designated verifier ZK authentication via PSK-keyed MAC.

1:: Inputs: Peers A (validator) and B (candidate); shared secret K (or derived $K_{auth}$ from a memory-hard KDF); context string $\texttt{ZKAuth-v1}$
2:: Constants: $T_{2}$ (freshness timeout). Optional throttle $T_{1}$ (not security-critical)
3:: Phase 1 (Candidate nonce): B samples $N_{B} \overset{\$}{\leftarrow} \{0,1\}^{256}$ and sends AuthInit$(N_{B})$ to A
4:: Phase 2 (Validator nonce): A samples $N_{A} \overset{\$}{\leftarrow} \{0,1\}^{256}$, records start time $\tau_{2}^{A} \leftarrow \mathrm{NOW}$, and sends Challenge$(N_{A})$
5:: Phase 3 (Response): B computes $Z \leftarrow \mathrm{KMAC}_{256}(K, \texttt{ZKAuth-v1} \| N_{A} \| N_{B}, 256)$ and sends Response$(Z)$
6:: Verification: If $(\mathrm{NOW} - \tau_{2}^{A}) > T_{2}$ then reject. Compute $Z' \leftarrow \mathrm{KMAC}_{256}(K, \texttt{ZKAuth-v1} \| N_{A} \| N_{B}, 256)$. Accept if and only if $Z = Z'$.
7:: Output: one of $\{\text{AUTHENTICATED}, \text{UNAUTHORIZED}\}$
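The three moves of Algorithm 3 can be sketched in Python as follows. Two assumptions should be noted: the Python standard library has no KMAC256, so HMAC with SHA3-256 is used here as a stand-in MAC, and the class and method names (`Validator`, `Candidate`, `challenge`, `respond`, `verify`) are hypothetical. The sketch covers a single direction and omits the optional throttle $T_{1}$.

```python
import hashlib
import hmac
import secrets
import time

CONTEXT = b"ZKAuth-v1"
T2_SECONDS = 3.0  # freshness timeout T2


def auth_tag(key: bytes, n_a: bytes, n_b: bytes) -> bytes:
    # Stand-in MAC: HMAC-SHA3-256 (assumption). Algorithm 3 specifies KMAC256.
    return hmac.new(key, CONTEXT + n_a + n_b, hashlib.sha3_256).digest()


class Candidate:
    def __init__(self, key: bytes):
        self.key = key

    def init(self) -> bytes:
        self.n_b = secrets.token_bytes(32)  # Phase 1: candidate nonce N_B
        return self.n_b

    def respond(self, n_a: bytes) -> bytes:
        return auth_tag(self.key, n_a, self.n_b)  # Phase 3: tag Z


class Validator:
    def __init__(self, key: bytes, clock=time.monotonic):
        self.key = key
        self.clock = clock

    def challenge(self, n_b: bytes) -> bytes:
        self.n_b = n_b
        self.n_a = secrets.token_bytes(32)  # Phase 2: validator nonce N_A
        self.started = self.clock()         # start of the T2 window
        return self.n_a

    def verify(self, z: bytes) -> bool:
        if self.clock() - self.started > T2_SECONDS:
            return False  # response arrived outside the freshness window
        expected = auth_tag(self.key, self.n_a, self.n_b)
        return hmac.compare_digest(z, expected)  # constant-time comparison
```

A run consists of `n_b = candidate.init()`, `n_a = validator.challenge(n_b)`, and `validator.verify(candidate.respond(n_a))`; the key itself never crosses the channel.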

The protocol employs an asymmetric validator–candidate model with critical timing constraints to prevent replay attacks:

Phase 1 (Initiation, ∼0 ms): Validator A initiates authentication by sending an empty request. Candidate B responds with 256-bit random nonce $N_{B}$ and starts $\tau_{1}^{B}$ (Timer1). This establishes B’s commitment to freshness before learning A’s challenge.

Phase 2 (Validator Nonce): Validator A samples its 256-bit random nonce $N_{A}$, records start time $\tau_{2}^{A}$ (Timer2), and sends Challenge$(N_{A})$ to candidate B. Validator A may optionally delay sending $N_{A}$ by time $T_{1}$ to throttle requests. Candidate B accepts $N_{A}$ upon receipt, relying on the cryptographic binding of $(N_{A}, N_{B})$ to ensure freshness.

Phase 3 (Zero-Knowledge Proof, <3 s): Candidate B computes $Z_{B} = \text{SHA3-256}(K \| N_{A} \| N_{B})$ using the shared secret K (actually, the password image $H_{PSK} = \text{SHA3-256}(K)$ stored locally) and both nonces. The proof must arrive at validator A within the Timer2 window ($T_{2} = 3000$ ms). Validator A independently computes the expected value $Z_{A}' = \text{SHA3-256}(K \| N_{A} \| N_{B})$ and accepts if $Z_{B} = Z_{A}'$.

Timing Security Rationale:

Rate Limiting: Timer1 ( $T_{1}$ ) acts as an optional spam-deterrent throttle. While the protocol does not strictly require global clock synchronization protocols like NTP [31,32], approximate synchronization assists in predictable rate limiting. Cryptographic replay protection is guaranteed by the unique candidate nonce $N_{B}$ generated in Phase 1.
Freshness Guarantee: Timer2’s ( $T_{2}$ ) window ensures proofs are computed with current session nonces, not cached values. This is the critical security-relevant timeout.
Asymmetric Commitment: Candidate commits to $N_{B}$ before seeing validator’s $N_{A}$ , preventing adaptive choice of nonces.

The optional throttle $T_{1}$ (when enabled) can be set to values like 10 s to rate-limit authentication attempts. However, for performance-critical deployments, $T_{1}$ can be set to 0 (no throttling), relying entirely on the cryptographic freshness guarantees from unique nonces and the $T_{2}$ timeout. Our experimental evaluation (Section 9) uses $T_{1} = 0$ to measure raw protocol overhead, achieving median join latencies of 1.2–1.8 s.

Mutual Authentication: For bidirectional authentication, both peers simultaneously execute the protocol in opposite roles (each peer acts as both validator and candidate). This requires maintaining separate state machines for local validator state and local candidate state, with distinct nonce pairs and timers for each direction.

The security of this mutual authentication scheme relies on the security of the underlying protocol under concurrent composition. This composition is secure because each peer maintains a separate and independent state machine for its validator and candidate roles. The nonces generated for the instance where peer A authenticates B (e.g., $N_{A}, N_{B}$) are distinct and independent from the nonces generated for the instance where B authenticates A (e.g., $N_{A}', N_{B}'$). This strict separation of state prevents an adversary from leveraging messages from one protocol instance to compromise the other, thus thwarting reflection or interleaving attacks.

Zero-Knowledge Property: The protocol reveals no information about K to eavesdroppers. Observed values are (1) random nonces $N_{A}, N_{B} \in \{0,1\}^{256}$, and (2) hash output $Z_{B} = \text{SHA3-256}(K \| N_{A} \| N_{B})$. Under the random oracle model for SHA3-256, the output $Z_{B}$ is computationally indistinguishable from a random string to any adversary not possessing K. The three-parameter construction ($K \| N_{A} \| N_{B}$) ensures that even if one nonce from a previous session were compromised, the adversary gains no advantage in the current session without knowing K.

Security of Three-Parameter Construction: The hash construction $Z = \text{SHA3-256}(K \| N_{A} \| N_{B})$ using three concatenated parameters provides several critical security properties not achievable with simpler two-parameter constructions:

1.: Dual-Nonce Freshness: Including both $N_{A}$ (validator nonce) and $N_{B}$ (candidate nonce) ensures that the hash input is fresh from both participants’ perspectives. A two-parameter construction $Z = \text{SHA3-256}(K \| N_{A})$ would allow the candidate to precompute responses for multiple possible validator nonces before Timer1 expires. Similarly, $Z = \text{SHA3-256}(K \| N_{B})$ would allow the validator to precompute challenges. The three-parameter construction forces real-time computation after both nonces are revealed.
2.: Reflection Attack Prevention: In concurrent composition (mutual authentication where both peers run validator and candidate roles simultaneously), the three-parameter construction prevents an adversary from reflecting messages from the $A \to B$ authentication instance to the $B \to A$ instance. Since each direction uses distinct nonce pairs $(N_{A}, N_{B})$ vs. $(N_{A}', N_{B}')$, proofs computed with one pair cannot be reused for the other pair (with probability $1 - 2^{-256}$).
3.: Domain Separation: The inclusion of both nonces provides implicit domain separation between different protocol instances. An adversary who compromises one past session transcript $(N_{A}^{old}, N_{B}^{old}, Z^{old})$ cannot use any part of it in a new session with fresh nonces $(N_{A}^{new}, N_{B}^{new})$ because the hash input domain $\{0,1\}^{768}$ (256 bits each for K, $N_{A}$, $N_{B}$) ensures that $Z^{old}$ reveals nothing about $Z^{new}$ under the random oracle model.
4.: Concatenation Security: The concatenation $K \| N_{A} \| N_{B}$ uses fixed-length components (each 256 bits), preventing length-extension attacks or ambiguous parsing. Unlike variable-length constructions where $K = \text{“ab”}$, $N_{A} = \text{“c”}$ could be confused with $K = \text{“a”}$, $N_{A} = \text{“bc”}$, the fixed-length constraint ensures unique parsing. SHA3-256 (Keccak) is additionally resistant to length-extension attacks by design.
5.: Key–Nonce Binding: The construction tightly binds the secret K to the session-specific nonces within a single hash invocation. This is stronger than schemes that separately hash components (e.g., $Z = \text{Hash}(K) \oplus \text{Hash}(N_{A} \| N_{B})$), which may leak partial information through algebraic structure. The monolithic hash $\text{SHA3-256}(K \| N_{A} \| N_{B})$ provides all-or-nothing security: the adversary learns nothing without K, and possession of K is both necessary and sufficient for proof generation.

These properties collectively ensure that the protocol achieves computational soundness (Theorem 1) with a concrete security bound $\mathrm{negl}(\lambda) \leq 2^{-256}$, even under concurrent composition and adversarial nonce selection.

Non-Transferability: Proofs cannot be reused across sessions because fresh random nonces are generated for each authentication attempt. No two executions produce identical transcripts (with probability $1 - 2^{-256}$).

Offline Attack Resistance: The security proofs assume K is a high-entropy secret (e.g., uniformly random 256-bit key). If K is a low-entropy password chosen by users, the protocol becomes vulnerable to offline dictionary attacks: an adversary capturing a transcript $(N_{A}, N_{B}, Z)$ can brute-force candidate passwords $K'$ by computing $Z' = \text{SHA3-256}(K' \| N_{A} \| N_{B})$ and checking if $Z' = Z$. For deployment with user-chosen secrets, we recommend deriving K from passwords using a key derivation function with high computational cost (e.g., Argon2, Scrypt) or requiring high-entropy keys generated by the system.
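Deriving K from a password with a memory-hard function could look like the sketch below. It uses scrypt because it ships with Python’s `hashlib` (Argon2 would need a third-party package). The cost parameters, the function name `derive_auth_key`, and the use of a per-swarm salt known to all members are assumptions of this example rather than values given in the paper.

```python
import hashlib


def derive_auth_key(password: str, salt: bytes) -> bytes:
    """Derive a 256-bit key from a password with scrypt (memory-hard KDF)."""
    # n, r, p are illustrative cost parameters (about 16 MiB of memory);
    # a deployment should tune them to its own hardware.
    return hashlib.scrypt(
        password.encode("utf-8"),
        salt=salt,
        n=2**14,
        r=8,
        p=1,
        dklen=32,
    )


# Hypothetical salt: every member of a swarm must use the same value so that
# all members derive the same key from the shared password.
EXAMPLE_SALT = b"swarm:example-room"
```

Each guess in an offline dictionary attack then costs one full scrypt evaluation instead of a single hash, which raises the price of brute force but does not remove the need for a reasonably strong password.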

7.4. Formal Security Properties

We formalize the security guarantees of our protocol under the random oracle model for SHA3-256.

Theorem 1 (EUF–CMA Soundness). Let $\mathrm{MAC} : \{0,1\}^{*} \to \{0,1\}^{n}$ be a message authentication code that is existentially unforgeable under a chosen-message attack (EUF–CMA), instantiated as $\mathrm{KMAC}_{256}$ with context string ZKAuth-v1. Consider the three-move protocol where the candidate chooses $N_{B} \overset{\$}{\leftarrow} \{0,1\}^{256}$; the validator chooses $N_{A} \overset{\$}{\leftarrow} \{0,1\}^{256}$, the candidate sends $Z := \mathrm{MAC}_{K}(\texttt{ZKAuth-v1} \| N_{A} \| N_{B})$, and the validator accepts if and only if its recomputation equals Z. Then, any PPT adversary $\mathcal{A}$ (playing the candidate) that causes the validator to accept with non-negligible probability $\epsilon(\lambda)$ yields an EUF–CMA forger against $\mathrm{MAC}$ with (essentially) the same advantage.

Proof. Assume an adversary $\mathcal{A}$ that, without knowledge of K, convinces the validator with probability $\epsilon$. We build a forger $\mathcal{F}^{\mathcal{O}_{K}(\cdot)}$ in the EUF–CMA game. $\mathcal{F}$ runs $\mathcal{A}$ internally and perfectly simulates the validator: it samples $N_{A} \overset{\$}{\leftarrow} \{0,1\}^{256}$ and receives $N_{B}$ from $\mathcal{A}$. When $\mathcal{A}$ outputs Z, $\mathcal{F}$ halts and outputs the purported forgery $(m^{\star}, t^{\star}) = (\texttt{ZKAuth-v1} \| N_{A} \| N_{B}, Z)$. $\mathcal{F}$ never queried its MAC oracle on $m^{\star}$, so $(m^{\star}, t^{\star})$ is a valid EUF–CMA forgery exactly when $\mathcal{A}$ would be accepted. Thus $\Pr[\mathcal{F} \text{ forges}] = \Pr[\mathcal{A} \text{ is accepted}] = \epsilon$, contradicting EUF–CMA unless $\epsilon$ is negligible. □

Theorem 2 (Outsider Indistinguishability). Let $\mathrm{MAC}$ be a PRF-secure MAC and let transcripts be $\tau = (N_{A}, N_{B}, Z)$ with $N_{A}, N_{B} \overset{\$}{\leftarrow} \{0,1\}^{256}$ and $Z = \mathrm{MAC}_{K}(\texttt{ZKAuth-v1} \| N_{A} \| N_{B})$. For any PPT distinguisher $\mathcal{D}$ that does not know K and has no oracle access to $\mathrm{MAC}_{K}$, the advantage in distinguishing real transcripts from simulated ones $(N_{A}, N_{B}, U_{n})$ with $U_{n} \overset{\$}{\leftarrow} \{0,1\}^{n}$ is negligible in λ.

Proof sketch. Hybrid H₀ is the real distribution using $\mathrm{MAC}_{K}$. By PRF security, H₀ is computationally indistinguishable from H₁, where Z is computed by a truly random function F on the input $(\text{ZKAuth-v1}, N_{A}, N_{B})$. In H₁, since each $(N_{A}, N_{B})$ pair is fresh with a probability of $1 - 2^{-256}$ and never reused, $F(\cdot)$ on fresh inputs is uniformly random and independent of $(N_{A}, N_{B})$. Therefore, H₁ is identical to the simulated distribution $(N_{A}, N_{B}, U_{n})$ up to a negligible nonce-collision probability. Thus, any distinguisher for real vs. simulated transcripts yields a PRF distinguisher for $\mathrm{MAC}$, contradicting its security. □

Corollary 1 (Outsider unlinkability). Let $\tau_{1}, \tau_{2}$ be two independent transcripts generated with fresh nonces. For any PPT adversary without K, deciding whether $\tau_{1}, \tau_{2}$ were produced under the same K or two independent keys has advantage at most negligible in λ.

Proof. Reduce to Theorem 2 by replacing both Z values with uniform $U_{n}, U_{n}'$; the resulting distribution hides any linkage apart from negligible nonce collisions.

Theorem 3 (Designated Verifier Zero Knowledge/Non-Transferability). For any PPT distinguisher $\mathcal{D}$ that holds the verifier’s secret K (designated verifier setting) and any auxiliary input z, there exists a simulator $\mathcal{S}(K; z)$ that outputs transcripts identically distributed to real protocol executions. In particular, $\mathcal{S}$ samples $N_{A}, N_{B} \overset{\$}{\leftarrow} \{0,1\}^{256}$ and sets $Z := \mathrm{MAC}_{K}(\text{ZKAuth-v1} \| N_{A} \| N_{B})$; the distribution of $(N_{A}, N_{B}, Z)$ is identical to the real distribution.

Proof. Straight-line simulation with K is identical to the real protocol because the verifier’s computation in the real protocol is precisely $Z = \mathrm{MAC}_{K}(\text{ZKAuth-v1} \| N_{A} \| N_{B})$ on fresh nonces. Hence, the designated verifier can simulate transcripts perfectly; transcripts therefore carry no transferable evidence to third parties. □

Theorem 4 (Completeness). If both peers A and B possess the same pre-shared key K, follow the protocol honestly, and communicate over a reliable, ordered channel (WebRTC SCTP data channel over DTLS) with round-trip latency $\mathrm{RTT} < T_{2}$, then both will successfully authenticate each other.

Proof. Both parties compute $Z = \mathrm{MAC}_{K}(\text{ZKAuth-v1} \| N_{A} \| N_{B})$ deterministically; equality holds and the validator accepts. DTLS provides confidentiality and integrity; SCTP provides reliability and ordering. □
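The three-move exchange analyzed in Theorems 1 and 4 can be sketched as follows. This is a minimal illustration, not the GRIDNET implementation: Python's standard library does not provide KMAC256, so HMAC-SHA3-256 is swapped in as a stand-in MAC, and the function names `mac_tag` and `run_handshake` are hypothetical.

```python
import hashlib
import hmac
import secrets

CTX = b"ZKAuth-v1"  # context string from the protocol description


def mac_tag(key: bytes, n_a: bytes, n_b: bytes) -> bytes:
    """Compute Z = MAC_K(Ctx || N_A || N_B).

    Assumption: HMAC-SHA3-256 stands in for KMAC256, which the paper
    uses but the Python standard library does not offer. Nonces have a
    fixed 32-byte length, so plain concatenation is unambiguous.
    """
    return hmac.new(key, CTX + n_a + n_b, hashlib.sha3_256).digest()


def run_handshake(candidate_key: bytes, validator_key: bytes) -> bool:
    """Simulate one protocol run and return the validator's decision."""
    n_b = secrets.token_bytes(32)  # move 1: candidate's nonce
    n_a = secrets.token_bytes(32)  # move 2: validator's nonce
    z = mac_tag(candidate_key, n_a, n_b)  # move 3: candidate's tag
    expected = mac_tag(validator_key, n_a, n_b)  # validator recomputes
    return hmac.compare_digest(z, expected)  # constant-time comparison
```

With equal keys the recomputation matches and the validator accepts (completeness); with a different candidate key the tags differ except with negligible probability (soundness).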

Theorem 5 (Liveness under bounded latency). Assume a bound $\mathrm{RTT}_{max}$ on round-trip latency of the signaling channel and timeout $T_{2} \geq 2\,\mathrm{RTT}_{max} + \epsilon$ for some slack $\epsilon > 0$. Then, barring crash failures, the protocol completes within $T_{total} \leq 3\,\mathrm{RTT}_{max} + \epsilon$ and acceptance is not prevented by timeouts.

Proof sketch. One RTT to deliver $N_{B}$, one RTT to deliver $N_{A}$, and one RTT to deliver Z suffice; $T_{2}$ accommodates a full challenge–response phase. Timer $T_{1}$ is an optional throttle (not security-critical); replay resistance comes from fresh nonces and $\mathrm{MAC}$ verification. □
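As a worked instance of the bounds in Theorem 5, take $\mathrm{RTT}_{max} = 1.5$ s (the RTT bound quoted in Section 9.2) and a slack of $\epsilon = 0.5$ s. The slack value is an assumption chosen for illustration; the paper does not fix it.

```latex
T_{2} \geq 2\,\mathrm{RTT}_{max} + \epsilon = 2(1.5\,\text{s}) + 0.5\,\text{s} = 3.5\,\text{s},
\qquad
T_{total} \leq 3\,\mathrm{RTT}_{max} + \epsilon = 3(1.5\,\text{s}) + 0.5\,\text{s} = 5.0\,\text{s}.
```

Under these assumed values, a $T_{2}$ of 3.5 s falls inside the 3–5 s timeout window reported for the specification in Section 7.9.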

7.5. Resistance to MITM

The protocol runs over an established WebRTC data channel (SCTP over DTLS). DTLS provides end-to-end encryption and authenticated integrity; SCTP provides reliable, ordered delivery. Since the DTLS handshake is authenticated by the certificate fingerprints that peers exchange in their SDP offer/answer during signaling, an attacker cannot intercept or modify challenges/responses without breaking DTLS. Effectively, the attacker must be one of the two DTLS endpoints to inject messages, which reduces the threat to an attacker who does not know K, a case already covered by the soundness proof (Theorem 1).

7.6. Revocation Mechanisms

The current shared-secret scheme cannot revoke an individual user without rotating the secret for everyone. For fine-grained revocation, we plan to integrate dynamic accumulators or revocation lists: peers could require a ZK proof of the form “I hold a credential AND it is not in the revoked set.” Known constructions, such as Blacklistable Anonymous Credentials (BLAC) [33] and verifier-local revocation in group signatures, could be integrated.

Our system accommodates revocation logic by broadcasting control messages. A control:banned message naming a peer ID could be broadcast, causing all peers to drop their connections to that peer. A banned peer could try rejoining under a new ephemeral ID, but if the credential itself is revoked (i.e., the secret is changed), they fail authentication.

7.7. TLA+ Specification

We provide a TLA+ specification capturing the protocol state machine and key properties (Listing 1):

Listing 1. TLA+ specification of DV-ZK authentication (PSK-keyed MAC).

TLA+ Specification Analysis: The specification models the DV-ZK authentication protocol using a MAC-based construction. The MAC function is modeled as an uninterpreted constant operator. Key features include:

MAC-based Authentication: Uses $MAC (K, Ctx ‖ N_{A} ‖ N_{B})$ instead of an unkeyed hash;
Concrete Secret Assignment: Honest peers explicitly assigned K;
Non-deterministic Nonce Generation: Uses existential quantification ($\exists n \in Nonces256$) for proper randomness modeling;
Freshness Timeout $T_{2}$ : Enforces time bound for tag delivery (Timer1 removed as not security-critical);
Context String $Ctx$ : Provides domain separation for MAC inputs.

Safety Property (SafetyInvariant): Authenticated peers must have matching MAC tags computed with correct secrets and nonces, and authentication must occur within the $T_{2}$ timeout window. This ensures only peers with K can generate valid tags.

Type and Freshness Invariants: The specification includes TypeOK to verify correct variable types and FreshNonces to ensure nonces are distinct within sessions.

Model checking with TLC on simplified scenarios confirms: (1) no adversary without K can authenticate (safety), (2) honest peers with matching K eventually authenticate (liveness), and (3) all authentications satisfy the $T_{2}$ freshness constraint.

7.8. TLA+ Specification for Signaling Protocol

The signaling protocol (Section 6) involves complex state management for join/offer/answer exchanges with onion-routed message delivery. Below, we provide a TLA+ specification modeling the core protocol phases and invariants (Listing 2).

Listing 2. TLA+ specification of WebRTC Swarm signaling with dedup/TTL.

Signaling Protocol TLA+ Specification Analysis: This specification models the core WebRTC Swarm signaling protocol phases described in Section 6. Key aspects include the following:

Global Time Modeling: The CurrentTime variable and AdvanceTime action formally model time progress, enabling verification of rate limiting (e.g., no joins faster than five time units). The weak fairness condition $\mathrm{WF}_{vars}(AdvanceTime)$ ensures time advances sufficiently for timing-dependent actions to complete.
Join Phase: SendJoin action models peer joining with rate limiting (5 s minimum between joins using global clock) and full node membership tracking with initial unauthenticated status.
Offer/Answer Exchange: SendOffer, SendAnswer, and ReceiveAnswer actions model the asymmetric SDP negotiation where new joiners always act as answerers to avoid glare problems.
Connection Establishment: EstablishConnection models ICE completion and transition to connected state.
Authentication Integration: AuthenticatePeer models the transition from unauthenticated to authenticated after the zero-knowledge protocol (specified separately) completes successfully.
Session Identity: Each negotiation has a unique sessionID to prevent replay attacks and ensure stale messages do not disrupt current sessions.
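The join-phase rate limit in the list above (no two joins by the same peer closer than five time units) can be sketched as a small guard. The class and method names are hypothetical, and time is passed in explicitly as an integer to mirror the specification's global clock; this is an illustration, not the TLA+ action itself.

```python
from typing import Dict

MIN_JOIN_INTERVAL = 5  # time units between joins, as in the specification


class JoinRateLimiter:
    """Tracks the last accepted join per peer against a global clock."""

    def __init__(self, min_interval: int = MIN_JOIN_INTERVAL) -> None:
        self.min_interval = min_interval
        self.last_join: Dict[str, int] = {}  # peer ID -> time of last join

    def try_join(self, peer: str, now: int) -> bool:
        """Accept the join only if enough time has passed for this peer."""
        last = self.last_join.get(peer)
        if last is not None and now - last < self.min_interval:
            return False  # too soon: the join is rejected
        self.last_join[peer] = now
        return True
```

A rejected join does not update the stored timestamp, so repeated early attempts do not push the next permitted join further into the future.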

Invariant 1 (Uniqueness): For any two honest peers p and q, at most one WebRTC negotiation is active between them at any time. This is enforced by state machine locking—connectionState[p][q] and connectionState[q][p] cannot both be in offerSent state simultaneously, and both peers cannot have active offers to each other.
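Invariant 1 can be illustrated with a minimal sketch of per-pair locking. The names `PairNegotiations`, `send_offer`, and `choose_roles` are hypothetical, and the state is reduced to a single "offerSent" marker; the actual specification tracks more states.

```python
from typing import Dict, Tuple


def choose_roles(existing_member: str, new_joiner: str) -> Tuple[str, str]:
    """Return (offerer, answerer): new joiners always answer."""
    return existing_member, new_joiner


class PairNegotiations:
    """Allows at most one active negotiation per unordered peer pair."""

    def __init__(self) -> None:
        # (offerer, answerer) -> state of the active negotiation
        self.state: Dict[Tuple[str, str], str] = {}

    def send_offer(self, offerer: str, answerer: str) -> bool:
        """Record an offer unless the pair already negotiates either way."""
        if (offerer, answerer) in self.state or (answerer, offerer) in self.state:
            return False  # would create glare or a duplicate negotiation
        self.state[(offerer, answerer)] = "offerSent"
        return True

    def close(self, a: str, b: str) -> None:
        """Release the pair so a later negotiation can start."""
        self.state.pop((a, b), None)
        self.state.pop((b, a), None)
```

Checking both orderings of the pair before recording an offer is what prevents the two directions from being in the offer-sent state at the same time.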

Invariant 2 (Identity Uniqueness): A peer has exactly 0 or 1 identity in a swarm across all full nodes. If peer p appears in swarm $sw$ at multiple full nodes, the membership status must be consistent (same authentication state).

Safety Property: Only authenticated peers can maintain connections. Any peer in connected state for a swarm must have memberStatus = authenticated at the relevant full node. This prevents unauthorized peers from participating in payload communication.

Liveness Property: If a peer initiates joining and there exist authenticated members in the swarm, eventually the joining peer will reach connected state with those members (assuming honest behavior and eventual message delivery).

Model checking with TLC on a bounded scenario (three peers, two full nodes, one swarm, bounded session IDs) confirmed: (1) Invariant 1 holds—no dual-offer glare conditions occurred, (2) Invariant 2 holds—peer identities remained consistent across nodes, (3) Safety holds—only authenticated peers reached connected state, and (4) Liveness holds—joining peers eventually connected to existing members within expected time bounds.

7.9. TLA+ Model Checking Verification Results

To validate the correctness and robustness of our formal specifications, we performed exhaustive model checking using the TLC model checker on both the DV-ZK authentication protocol and the WebRTC Swarm signaling protocol. Table 1 summarizes the comprehensive verification results. The full TLA+ source code used for this verification is available as Supplementary Materials.

Table 1. TLA+ model verification results.

Key Findings: The exhaustive model checking explored over 27 million total states across both specifications, verifying all critical safety and liveness properties without finding any violations. For DVZKAuth, the verification confirmed that (1) only peers with matching shared secrets can authenticate, (2) authentication sessions expire correctly after the T2 timeout window (3–5 s per specification), and (3) nonce freshness constraints prevent replay attacks even with finite nonce sets. For WebRTC Signaling, verification confirmed (4) the anti-glare property prevents dual-offer deadlocks between peer pairs, (5) message deduplication via unique message IDs prevents forwarding loops in the overlay network, and (6) peer membership status remains consistent across distributed full nodes.

The negligible fingerprint collision probabilities ($< 10^{-6}$) provide high confidence that TLC explored all reachable states within the bounded model. These results validate that our TLA+ specifications are not only executable but also formally proven correct under the specified constraints, giving strong assurance that the implemented protocols will behave safely in production environments.

8. Cryptoeconomic Incentives and Abuse Resistance

A distinguishing feature of WebRTC Swarms is the integration of cryptoeconomic incentives via micro-deposit transmission protocols (MDTPs) and token-based resource management, building on prior work by Skowronski et al. on SPIDE (Sybil-Proof, Incentivized Data Exchange) [4].

The economic security of the system relies on the intrinsic value of the token. In the GRIDNET ecosystem, these tokens are interchangeable with the native base currency (GNC). This interchangeability ensures that tokens represent real-world computational cost. While the regulatory classification of utility tokens varies by jurisdiction, the protocol treats them strictly as computational credits required to prevent resource exhaustion.

8.1. Transmission Tokens and Token Pools

Each peer has an associated token pool—a budget of micropayment tokens allocated for network usage. Before sending a signaling message, the peer calls authenticateSDP(), which attempts to retrieve a token from the pool. If no token is available (pool empty), the authentication returns false and the message is not sent, enforcing a strict budget policy.

Because a token is attached to every outgoing message, relays can be configured to require tokens as a forwarding condition. Full nodes check incoming messages: if a valid token is attached, they accept and forward the message; if the token is missing or invalid, they drop or delay it. Each token might represent a tiny value (e.g., $0.0001). Forwarding earns the node that token. Nodes accumulate micro-earnings and eventually redeem or trade them.
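The sender-side budget check and relay-side token gate described above can be sketched as follows. This is a simplified model under stated assumptions: tokens are opaque strings, validity is a set-membership check, and `TokenPool`, `authenticate_sdp`, and `Relay` are illustrative names loosely modeled on authenticateSDP() rather than the actual GRIDNET API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class TokenPool:
    """A peer's budget of single-use transmission tokens."""
    tokens: List[str] = field(default_factory=list)

    def is_available(self) -> bool:
        return bool(self.tokens)

    def take(self) -> Optional[str]:
        return self.tokens.pop() if self.tokens else None


def authenticate_sdp(pool: TokenPool, message: Dict[str, str]) -> bool:
    """Attach a token to the message; return False if the pool is empty."""
    token = pool.take()
    if token is None:
        return False  # strict budget policy: the message is not sent
    message["token"] = token
    return True


class Relay:
    """A full node that forwards only messages carrying a valid token."""

    def __init__(self, valid_tokens: Set[str]) -> None:
        self.valid = set(valid_tokens)  # assumed validity oracle
        self.earned: List[str] = []

    def forward(self, message: Dict[str, str]) -> bool:
        token = message.get("token")
        if token is None or token not in self.valid:
            return False  # missing or invalid token: drop
        self.valid.discard(token)  # each token can be spent once
        self.earned.append(token)  # forwarding earns the token
        return True
```

In this sketch a replayed token is rejected because the relay removes it from the valid set on first use; a real deployment would need a verification scheme that does not rely on a local set.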

8.2. Budget Policies

The system allows flexible policies:

Fixed Rate Budget: Each user gets X tokens per hour, and each message costs 1 token. This caps the message rate. For example, $X = 60$ tokens/hour allows a maximum of 1 message per minute on average without buying more, preventing flooding.
Micro-streaming: Instead of charging per message, charge per byte or per connection time. Maintain a counter of data bytes forwarded and deduct tokens continuously.
Priority via Payment: Nodes might prioritize messages with higher token amounts (not explicitly implemented but architecturally plausible).
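The fixed-rate policy in the list above behaves like a token bucket. The sketch below is one possible reading, assuming the hourly allowance accrues continuously and is capped at X tokens; the class name `FixedRateBudget` is hypothetical, and time is passed in explicitly (in seconds) to keep the example deterministic.

```python
class FixedRateBudget:
    """Token bucket that refills at `tokens_per_hour`, capped at that amount."""

    def __init__(self, tokens_per_hour: float, now: float = 0.0) -> None:
        self.tokens_per_hour = tokens_per_hour
        self.balance = float(tokens_per_hour)  # start with a full allowance
        self.last = now

    def try_spend(self, now: float, cost: float = 1.0) -> bool:
        """Refill for the elapsed time, then spend `cost` if affordable."""
        elapsed = now - self.last
        refill = elapsed * self.tokens_per_hour / 3600.0
        self.balance = min(self.tokens_per_hour, self.balance + refill)
        self.last = now
        if self.balance >= cost:
            self.balance -= cost
            return True
        return False  # budget exhausted: the message must wait
```

With $X = 60$, a peer that spends its whole allowance at once has to wait 60 s for each further message, which matches the one-message-per-minute average in the example above.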

8.3. Relay Throttling and DoS Prevention

With tokens in play, a malicious user flooding the network faces a financial cost for each attempt. If an attacker has finite tokens, it cannot spam indefinitely: once the pool is depleted, getIsTokenPoolAvailable returns false and messages are not sent. For example, a naive attack of 1000 join requests per second at 1 token each requires 1000 tokens/second. Honest users typically send far fewer messages (a normal join handshake involves a few dozen messages, costing a few dozen tokens).

Relays can also throttle traffic by requiring more tokens for higher volumes, creating a bandwidth market. If an attacker attempts a brute-force attack by paying substantially, they incur real economic loss. Since relays earn tokens from forwarding, they are incentivized to carry traffic rather than drop it arbitrarily (aside from security filtering), which helps with censorship resistance.

8.4. Sybil Resistance

The primary defense against the large-scale Sybil attacks described in the threat model (Section 5.1), where an adversary attempts to control a significant fraction of the relay network, is cryptoeconomic. If joining the network as a full node requires staking a non-trivial amount of tokens, it becomes prohibitively expensive for an attacker to spawn enough malicious nodes to compromise the network’s anonymity guarantees: the attacker must either split funds across many nodes, weakening each, or incur a high total cost. Furthermore, a misbehaving Sybil node (e.g., one that drops messages despite being paid) can be detected and penalized, causing the attacker to lose their stake. This aligns economic incentives with honest network operation.

The same reasoning applies to flooding. For instance, sustaining a flooding attack of 10,000 requests per second would deplete a standard user’s stake within minutes, requiring a financial investment that scales linearly with the attack duration and making long-term disruption economically irrational compared to the cost of honest participation. Ultimately, the effectiveness of this protection is governed by decentralized consensus, as it depends on the value of unitary transmission tokens, a parameter that can be adjusted through on-chain governance. Higher token valuations would render attacks prohibitively expensive even sooner, providing a tunable security parameter that the community can adjust in response to observed threat levels. For a comprehensive treatment of Sybil-proof data transmission mechanisms, including detailed formalization of transmission tokens and token pools, we refer readers to our prior work on SPIDE [4].

8.5. Economic Tuning

A network can adjust token prices and budgets via governance. The community could vote (using decentralized voting in GRIDNET) to change how many tokens joining costs or issue more tokens to active participants as rewards (mining-like). A dynamic pricing model could be used: if the network is under heavy load, the token price rises automatically, throttling usage and maintaining performance.

Since micropayments are per message/action, the system does not rely on heavy on-chain transactions (too slow/costly). The tokens might be off-chain ephemeral tokens or blockchain state channel microtransactions.

8.6. Performance Benchmarks

Empirical tests show the following results with token gating:

NAT traversal success: Approached 100% (full nodes act as TURN relays). Under heavy NAT-churn conditions, token incentives ensure that enough nodes advertise TURN service (earning tokens for relaying media).
Join latency: Additional token verification overhead is minimal (hash check under 0.5 ms on average). A typical join (with one full node hop) completes with ∼300 ms of signaling overhead, with ICE dominating at a 1–2 s median. Multi-hop circuits add ∼100 ms per hop, a reasonable tradeoff for decentralization.
Churn resilience: With micropayments, nodes remain as long as they are paid. Tests with 20% of peers leaving and rejoining rapidly showed that the swarm continued functioning and that average join latency did not degrade significantly (<10% increase), thanks to token gating preventing join-signal overload.
Abuse under pricing pressure: In the attack scenario, 50 peer processes attempted to flood the swarm. Without tokens, they could overwhelm honest peers. With tokens enforced, the processes shared a fixed pool; after a few dozen messages each, the pool ran dry and they could not continue. An honest peer (with sufficient tokens) successfully joined after the malicious ones were throttled.

An extremely well-funded attacker (simulated with 10,000 tokens) could send 10k messages at the base price, but dynamic pricing kicks in: after 1000 messages, each additional message requires 2 tokens, so the attacker’s effective message count is lower. The burst lasts 2–3 s and causes temporary degradation (join latency rises from 1.2 s to 2.5 s during the burst), but the cost is significant and the burst stops once the tokens are burned. The network can adjust costs via governance to mitigate prolonged impact.
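The effect of the two-tier pricing on the funded attacker can be worked out with a short calculation. The function below is an illustrative sketch that assumes a single price step (1 token per message up to the threshold, 2 tokens afterwards), as described in the text; the name `affordable_messages` is hypothetical.

```python
def affordable_messages(
    budget: int,
    threshold: int = 1000,
    base_cost: int = 1,
    surge_cost: int = 2,
) -> int:
    """Number of messages a budget buys under one-step dynamic pricing."""
    # Messages bought at the base price, up to the pricing threshold.
    cheap = min(threshold, budget // base_cost)
    if cheap < threshold:
        return cheap  # the budget ran out before the price step
    # Remaining tokens buy messages at the higher price.
    remaining = budget - cheap * base_cost
    return cheap + remaining // surge_cost


attacker_messages = affordable_messages(10_000)
```

Under these assumptions, 10,000 tokens buy 1000 messages at the base price plus 4500 at the doubled price, i.e., 5500 messages, which is in line with the burst of roughly 5000 messages reported in Section 9.5.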

In summary, MDTP-based microtransactions turn network resources into metered commodities, aligning usage with cost. Honest usage (moderate) costs little and can be offset by earning tokens when relaying for others. Malicious usage (excessive) becomes prohibitively expensive, filtering noise and ensuring capacity for legitimate signaling even under load.

9. Experimental Evaluation

We implemented the WebRTC Swarms protocol stack in GRIDNET Core (C++/JavaScript hybrid environment) and conducted experiments to evaluate: (1) connectivity success rates in NAT-heavy environments, (2) join/leave latency and overhead, (3) behavior under churn, and (4) abuse resistance under adversarial load.

9.1. Testbed Setup

We deployed a network of 10 full nodes on geographically distributed cloud VMs (AWS and Azure regions) and up to 50 browser-based peers (automated using Puppeteer on Chromium-headless instances, with manual verification performed on Chrome 110 and Firefox 109) behind diverse NATs (home broadband, symmetric carrier-grade NAT via VPN simulation, enterprise NAT setups). All peers were configured with zero-knowledge shared secret authorization (32-byte key) and initial token pool of 1000 tokens each. Full nodes ran the default routing algorithm (flood for joins, unicast for others) with HOP_COUNT_LIMIT=100. ICE servers (STUN) were provided but no TURN was used; the swarm overlay is expected to act as a relay if needed.
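The flood routing used for joins, combined with the hop limit above and the message-ID deduplication mentioned in Section 7.9, can be sketched as a breadth-first propagation. This is a simplified, centralized simulation for a single message: the shared `seen` map stands in for each node's own record of message IDs it has already forwarded, and the function name `flood` and the adjacency-list topology are assumptions.

```python
from collections import deque
from typing import Dict, List

HOP_COUNT_LIMIT = 100  # value used in the testbed configuration


def flood(
    adjacency: Dict[int, List[int]],
    origin: int,
    hop_limit: int = HOP_COUNT_LIMIT,
) -> Dict[int, int]:
    """Return {node: hop count} for every node one flooded message reaches."""
    seen = {origin: 0}  # nodes that have already accepted this message
    queue = deque([(origin, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops >= hop_limit:
            continue  # hop budget exhausted: do not forward further
        for neighbor in adjacency[node]:
            if neighbor in seen:
                continue  # duplicate message ID: dropped, breaking loops
            seen[neighbor] = hops + 1
            queue.append((neighbor, hops + 1))
    return seen


# Hypothetical four-node ring of full nodes.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
reached = flood(ring, origin=0)
```

Deduplication is what terminates the flood on cyclic topologies such as the ring; the hop limit acts as an additional bound on propagation distance.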

9.2. NAT Traversal Success

Result: In 200 join attempts across varying NAT combinations, 197 succeeded in establishing direct P2P WebRTC connections (98.5% success). The remaining three cases all involved symmetric NAT pairs where direct ICE failed; the swarm automatically fell back to using a full node as a relay (offers were exchanged and, when direct ICE timed out, peers sent data through full node forwarding). Thus, from an application perspective, 100% of calls were completed, either via P2P or via swarm relay.

Comparison: STUN-alone scenarios typically report ∼85% success [34]. Success approached 100% due to the full nodes falling back to acting as decentralized TURN relays when direct P2P failed. Additional latency from relaying through a full node is modest (∼30 ms on average for one hop, measured in the symmetric NAT case through a nearby full node). These near-100% success rates empirically validate the liveness property in Theorem 5, where honest peers eventually authenticate and connect under realistic RTT < 1.5 s, with the decentralized relay overlay ensuring completeness even when direct ICE fails. This relay mechanism also empirically supports unlinkability (Corollary 1), as relayed connections prevent direct IP exposure, with transcripts remaining indistinguishable across sessions.

9.3. Join Latency

Median join latency (measured from joinSwarm() initiation to receiving last getOffer/Offer and sending all answers):

Swarm of 10 members: 1.2 s median;
Swarm of 20 members: 1.8 s median.

In summary, the 1.2 s median for the 10-member swarm breaks down into ∼150 ms for join message propagation through the overlay (often two hops), ∼300 ms for all offers to be generated by peers (some delay is due to offer SDP creation in browsers), and ∼750 ms for ICE exchanges to complete. In larger swarms (20 members), this increases to 1.8 s, largely because the new peer must process more offers and ICE candidates: offers are generated in parallel by different peers, but the new peer applies many SDP answers sequentially and gathers many ICE candidates, which becomes the limiting factor.

Token authentication adds an overhead for attaching/verifying tokens of under 0.5 ms on average (negligible). The join process for 10 members involves ∼50 signaling messages (one join, nine offer requests, nine offers, nine answers, ∼20 ICE exchanges, and a few control messages), i.e., ∼50 tokens spent/verified. There is no measurable slowdown due to this; CPU usage on nodes for token checks remained low (<1% of a single core). Note: For these latency experiments, the optional security throttle $T_{1}$ was set to 0 s to measure the raw protocol overhead and ICE performance. The observed medians of 1.2–1.8 s represent the baseline performance without artificial rate-limiting delays.

9.4. Churn Resilience

A 30 min run in which peers randomly leave/rejoin the swarm was conducted. At any time, ∼15 peers exist in the swarm, but membership rotated, totaling 50 unique peers cycling in/out. Despite churn, the swarm maintained connectivity: full nodes updated the membership lists promptly (departing peer’s node broadcast control:swarmClosing, removing member). The average rejoin time for a peer leaving and rejoining after 60 s was 1.5 s (slightly higher than first-time joining due to re-establishing ICE).

No issues like ghost entries or duplicate identities were encountered thanks to rules preventing stale entries and the use of unique ephemeral IDs per join attempt (seqNr ensured rejoin was not mistaken for a duplicate old join).

A partition scenario was evaluated, splitting the full node network (simulated by firewalling some nodes). The overlay routed within each partition, but join messages did not cross partitions. When the partition healed, the pending join information was eventually exchanged (the overlay flood resumed). No inconsistency occurred within the connected components.

9.5. Abuse Resistance Under Load

A stress test emulating a DoS attack was evaluated: five malicious peers continuously sending dummy join requests (without completing ZK proof) and rapidly sending many ICE candidates with bogus data. Without defenses, this could consume the bandwidth/CPU usage of full nodes and annoy peers with bogus signaling. However, each malicious peer had only 100 tokens and each join/ICE message consumed 1 token. Thus, each malicious peer could send a maximum of 100 messages in their attack.

It was observed that attackers exhausted their token pools in under 5 s (spamming at 20 msgs/s). In those 5 s, honest peers did see some join attempts from malicious IDs, but since these peers never completed ZK authentication, they remained “unauthorized” and did not disrupt real data (no media was sent to them). Full nodes forwarded a few attempts initially (earning tokens), but as soon as the tokens ran out, no more spam arrived. The network easily absorbed the attack; CPU usage on full nodes peaked at ∼20% while handling the flurry, then dropped.

Important result: no honest peer’s service was significantly affected. Ongoing calls between honest peers continued normally. Some reported a slight glitch (sub-second) when the flood of ICE packets arrived, but due to the separate signaling channel and the fact that browsers ignore unknown/unmatched ICE, media quality was not degraded.

An extremely well-funded attacker scenario was also evaluated by giving the attacker 10,000 tokens, enough to send 10k messages at the base price. Dynamic pricing was configured so that, after 1000 messages, each additional message required two tokens, which reduced the attacker’s effective message count. All tokens were spent after a 2–3 s burst of ∼5000 messages. Full nodes handled it, though join latency for new honest peers increased from ∼1.2 s to ∼2.5 s during the burst (the nodes were busy processing spam). However, because of token burn, the burst stopped. This demonstrates that an extremely rich attacker could cause short degradation, but at a significant cost. The network can raise the cost per message, manually or via governance, in reaction to such bursts. Token exhaustion limiting attacks to 5–10 s aligns with the soundness bound in Theorem 1 (negl($\lambda$) success for adversaries without K), as malicious peers without credentials fail ZK proofs, and economic costs prevent unbounded replays or floods.

9.6. Quality of Service

A non-attack scenario was conducted, measuring media latency and throughput when relaying through the swarm vs direct. In direct P2P, the data channel ping averaged 60 ms (between two distant peers). Through one relay (path peer → relay → peer), the latency was ∼80 ms with an overhead of 20 ms. With two relays, this was ∼95 ms. The throughput for the relayed data channel was about 90% of the direct throughput (an extra TCP hop, but full nodes forward at the line rate with minimal buffering). The video quality when relayed was stable, though the full node’s uplink capacity becomes a factor.

10. Conclusions and Future Work

We presented WebRTC Swarms, a novel architecture for decentralized, privacy-preserving signaling and group authorization in peer-to-peer networks. By integrating onion-routed relay circuits with zero-knowledge membership proofs and cryptoeconomic incentives, our design achieves a synergy between privacy, security, and sustainability that is greater than the sum of its parts. Users discover and connect to peers without exposing identities or IP addresses, relying on an overlay of incentivized full nodes to carry signaling traffic. Through a lightweight zero-knowledge protocol, peers anonymously prove their right to join sessions, ensuring that only authorized members participate. The transmission token mechanism ensures that the network resists abuse: participants “pay their way” in micro amounts, funding infrastructure and deterring spammers/Sybils.

Rigorous analysis, including examination of the routeSDPEntities() and processSignalEntity() logic, confirms that SDP and ICE messages are handled securely and efficiently. Each message is validated, signed/token-tagged, and routed with appropriate checks (SDP session ID matching prevents replay). We formally specified the protocol state machines and proved key safety and liveness properties, giving high confidence in correctness.

Empirical results demonstrate that the system performs well under realistic conditions: NAT traversal success is essentially 100%, join latencies are low (1–2 s for swarms of dozens of peers), and the network withstands adversarial attacks by leveraging the economic disincentives of the token system. Even under token-budget exhaustion attacks, honest peers’ experiences were only marginally affected and quickly recovered, indicating strong resilience.

Future Directions

WebRTC Swarms open several avenues for future work:

Advanced Cryptography: Explore incorporating more advanced anonymous credential schemes that allow fine-grained revocation and delegation (e.g., time-limited access tokens for guests). Using verifiable random functions or accumulator proofs could eliminate the need to restart the swarm when a member is removed by enabling verifier-local revocation checks.

Scalability: While the current flooding approach works for moderately sized swarms, larger networks might benefit from multicast or spanning tree signaling topology. This could leverage token incentives to dynamically construct efficient overlay topologies (pay certain nodes to act as multicast branch points). Ensuring anonymity in optimized routing is an interesting challenge—perhaps intersecting with obfuscation techniques to hide multicast patterns.

Governance: WebRTC Swarms come with built-in “hooks” for community control. Token parameters, access policies, and software updates can be decided through decentralized governance (GRIDNET OS supports on-chain voting). The community could vote to adjust token price if it is deemed too high/low and implement countermeasures against abusive behavior (blacklisting public keys, tightening proof difficulty). Such adaptability is crucial in a system that is intended to be long-lived and free of centralized oversight.

Deployment: The most profound implication of this study is the empowerment of users to create their own private, self-governing communication networks. A WebRTC Swarm is essentially a microcosm of a decentralized internet service: it has its own membership, with access controlled by community-defined secrets enforced through ZK proofs; its own “economy” (via tokens); and its own infrastructure (full nodes provided by the community). It operates without trusting a central server or revealing metadata to an outside party. Users can spin up a swarm for a particular community or purpose (e.g., an activist organization or a distributed IoT network), confident that joiners are bona fide (authorized by the group’s policy) yet anonymous to each other apart from pseudonyms, and that the network will scale and sustain itself through built-in incentives.

In conclusion, WebRTC Swarms represent a step toward truly decentralized real-time communication that does not compromise on privacy or accountability. By unifying signaling anonymity, group authorization, and incentive-aligned governance, our approach lays the foundation for communication systems where users retain full control of identity, data, and network rules. This aligns closely with the ethos of GRIDNET OS and similar platforms striving to redefine online interaction: not through siloed applications controlled by single entities, but through open protocols owned and operated by participants. We believe that WebRTC Swarms can serve as a privacy-first signaling substrate for a wide range of applications, from peer-to-peer conferencing and messaging to collaborative VPNs or edge networks, all with built-in governance hooks letting communities decide how the network should run.

The realization of WebRTC Swarms in a production environment demonstrates the possibility to achieve high performance and robustness without surrendering privacy or requiring centralized mediation. As the internet community grapples with issues of trust, surveillance, and platform power, we offer this work as a blueprint for alternative models—one where decentralization and privacy are not just theoretical ideals, but practical, working features of everyday communication tools.

Supplementary Materials

The following supporting information can be downloaded at: https://github.com/GRIDNETOS/GRIDNETOS/tree/main/Research/WebRTC%20Swarms/tlaplus (accessed on 18 December 2025). The repository contains the complete TLA+ specifications used for the formal verification of the protocols presented in this study.

Funding

This research received no external funding.

Data Availability Statement

The TLA+ specifications and verification results presented in this study are available in the paper. The implementation is available in the GRIDNET OS repository. Available online: https://github.com/GRIDNETOS/GRIDNETOS (accessed on 18 December 2025).

Acknowledgments

The author thanks the GRIDNET community for their contributions to the open-source implementation and valuable feedback during development.

Conflicts of Interest

The author declares no conflicts of interest.

References

Bergkvist, A.; Burnett, D.C.; Narayanan, A.; Aboba, B.; Brandstetter, T. WebRTC 1.0: Real-Time Communication Between Browsers. W3C Recommendation, January 2021. Available online: https://www.w3.org/TR/webrtc/ (accessed on 18 December 2025).
Jennings, C.; Uberti, J.; Rescorla, E. Interactive Connectivity Establishment (ICE): A Protocol for Network Address Translator (NAT) Traversal. RFC 8445, July 2018. Available online: https://datatracker.ietf.org/doc/html/rfc8445 (accessed on 18 December 2025).
Skowronski, R. UI dApps Meet Decentralized Operating Systems: A Practical Guide for Developing Decentralized User Interfaces. Electronics 2022, 11, 3004.
Skowronski, R.; Brzeziński, J. SPIDE: A Sybil-proof incentivized decentralized exchange protocol. Clust. Comput. 2022, 25, 2241–2270.
Lamport, L. Specifying Systems: The TLA+ Language and Tools for Hardware and Software Engineers; Addison-Wesley: Boston, MA, USA, 2002.
Matrix: An Open Network for Secure, Decentralized Communication. Available online: https://matrix.org (accessed on 18 December 2025).
Jami: Free and Universal Communication Platform. Available online: https://jami.net (accessed on 18 December 2025).
Tegeler, F.; Koll, D.; Fu, X. WebDHT: Browser-compatible distributed hash table for P2P web applications. In Proceedings of the IEEE Network Computing and Applications (NCA), Boston, MA, USA, 14–16 December 2022; pp. 1–8.
Hyperswarm: A Distributed Networking Stack for Connecting Peers. Available online: https://github.com/hyperswarm (accessed on 18 December 2025).
libp2p: A Modular Network Stack. Available online: https://libp2p.io (accessed on 18 December 2025).
Dingledine, R.; Mathewson, N.; Syverson, P. Tor: The second-generation onion router. In Proceedings of the 13th USENIX Security Symposium, San Diego, CA, USA, 9–13 August 2004; pp. 303–320.
Dingledine, R.; Mathewson, N.; Syverson, P. Deploying low-latency anonymity: Design challenges and social factors. IEEE Secur. Priv. 2007, 5, 83–87.
Reed, M.G.; Syverson, P.F.; Goldschlag, D.M. Anonymous connections and onion routing. IEEE J. Sel. Areas Commun. 1998, 16, 482–494.
Murdoch, S.J.; Danezis, G. Low-cost traffic analysis of Tor. In Proceedings of the IEEE Symposium on Security and Privacy, Oakland, CA, USA, 8–11 May 2005; pp. 183–195.
Chaum, D.; van Heyst, E. Group signatures. In Advances in Cryptology, Proceedings of the EUROCRYPT’91, Brighton, UK, 8–11 April 1991; Springer: Berlin/Heidelberg, Germany, 1991; pp. 257–265.
Boneh, D.; Boyen, X.; Shacham, H. Short group signatures. In Advances in Cryptology, Proceedings of the CRYPTO 2004, Santa Barbara, CA, USA, 15–19 August 2004; Springer: Berlin/Heidelberg, Germany, 2004; pp. 41–55.
Camenisch, J.; Van Herreweghen, E. Design and implementation of the idemix anonymous credential system. In Proceedings of the 9th ACM Conference on Computer and Communications Security, Washington, DC, USA, 18–22 November 2002; pp. 21–30.
Paquin, C.; Zaverucha, G. U-Prove Cryptographic Specification v1.1 (Revision 3); Technical Report; Microsoft Corporation: Redmond, WA, USA, 2013.
Corrigan-Gibbs, H.; Ford, B. Dissent: Accountable anonymous group messaging. In Proceedings of the 17th ACM Conference on Computer and Communications Security (CCS), Chicago, IL, USA, 4–8 October 2010; pp. 340–350.
Corrigan-Gibbs, H.; Wolinsky, D.I.; Ford, B. Proactively accountable anonymous messaging in Verdict. In Proceedings of the 22nd USENIX Security Symposium, Washington, DC, USA, 14–16 August 2013; pp. 147–162.
Ghosh, M.; Richardson, M.; Ford, B.; Jansen, R. A TorPath to TorCoin: Proof-of-bandwidth altcoins for compensating relays. In Proceedings of the 7th Workshop on Hot Topics in Privacy Enhancing Technologies, Amsterdam, The Netherlands, 18 July 2014.
Amphora: Incentivizing Mixnet Nodes Through Probabilistic Micropayments; Technical Report; Constellation Network: San Francisco, CA, USA, 2021.
Merkle, R.C. Protocols for public key cryptosystems. In Proceedings of the IEEE Symposium on Security and Privacy, Oakland, CA, USA, 14–16 April 1980; pp. 122–133.
Rivest, R.L.; Shamir, A. PayWord and MicroMint: Two simple micropayment schemes. In Security Protocols; Springer: Berlin/Heidelberg, Germany, 1997; pp. 69–87.
Back, A. Hashcash—A Denial of Service Counter-Measure. 2002. Available online: http://www.hashcash.org/papers/hashcash.pdf (accessed on 18 December 2025).
Wood, G. Ethereum: A Secure Decentralised Generalised Transaction Ledger. Available online: https://bravenewcoin.com/assets/Whitepapers/Ethereum-A-Secure-Decentralised-Generalised-Transaction-Ledger-Yellow-Paper.pdf (accessed on 18 December 2025).
Buterin, V.; Conner, E.; Dudley, R.; Slipper, M.; Norden, I.; Bakhta, A. EIP-1559: Fee Market Change for ETH 1.0 Chain. Available online: https://eips.ethereum.org/EIPS/eip-1559 (accessed on 18 December 2025).
Pointcheval, D.; Stern, J. Security arguments for digital signatures and blind signatures. J. Cryptol. 2000, 13, 361–396.
Petit-Huguenin, M.; Salgueiro, G.; Rosenberg, J.; Wing, D.; Mahy, R.; Matthews, P. Session Traversal Utilities for NAT (STUN). RFC 8489, February 2020. Available online: https://www.rfc-editor.org/info/rfc8489 (accessed on 18 December 2025).
Reddy, T.; Johnston, A.; Matthews, P.; Rosenberg, J. Traversal Using Relays around NAT (TURN): Relay Extensions to Session Traversal Utilities for NAT (STUN). RFC 8656, February 2020. Available online: https://www.rfc-editor.org/info/rfc8656 (accessed on 18 December 2025).
Mills, D.; Martin, J.; Burbank, J.; Kasch, W. Network Time Protocol Version 4: Protocol and Algorithms Specification. RFC 5905, June 2010. Available online: https://www.rfc-editor.org/info/rfc5905 (accessed on 18 December 2025).
Mills, D.L. Internet time synchronization: The Network Time Protocol. IEEE Trans. Commun. 1991, 39, 1482–1493.
Tsang, P.P.; Au, M.H.; Kapadia, A.; Smith, S.W. Blacklistable anonymous credentials: Blocking misbehaving users without TTPs. In Proceedings of the ACM Conference on Computer and Communications Security, Alexandria, VA, USA, 29 October–2 November 2007; pp. 72–81.
Guha, S.; Francis, P. Characterization and measurement of TCP traversal through NATs and firewalls. In Proceedings of the Internet Measurement Conference (IMC), Berkeley, CA, USA, 19–21 October 2005; p. 18.

Figure 1. Simplified topology showing peers connecting through full nodes that form an onion-routed overlay for signaling. Multi-hop circuits (e.g., X → Z → Y) prevent any single node from linking sender to receiver.
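
The layering idea behind the circuit in Figure 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the GRIDNET OS implementation: the per-hop keys, the JSON envelope, the destination label `peer-B`, and the functions `keystream_xor`, `wrap`, and `peel` are all hypothetical, and the SHA-256 keystream is a toy stand-in for a real authenticated cipher. It shows only that each hop on X → Z → Y can learn its successor and nothing further.

```python
import hashlib
import json


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy cipher: XOR data with a SHA-256 counter keystream (NOT secure)."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


def wrap(path, keys, destination, payload: bytes) -> bytes:
    """Build nested layers, innermost first, so each hop sees only its successor."""
    hops = list(path)
    next_hops = hops[1:] + [destination]
    blob = payload
    for hop, nxt in reversed(list(zip(hops, next_hops))):
        layer = json.dumps({"next": nxt, "inner": blob.hex()}).encode()
        blob = keystream_xor(keys[hop], layer)
    return blob


def peel(key: bytes, blob: bytes):
    """Remove one layer with this hop's key; return (next hop, remaining blob)."""
    layer = json.loads(keystream_xor(key, blob))
    return layer["next"], bytes.fromhex(layer["inner"])


if __name__ == "__main__":
    # Hypothetical per-hop keys; a real circuit would negotiate these per session.
    keys = {"X": b"key-x", "Z": b"key-z", "Y": b"key-y"}
    blob = wrap(["X", "Z", "Y"], keys, "peer-B", b"signaling message")
    for hop in ["X", "Z", "Y"]:
        nxt, blob = peel(keys[hop], blob)
        print(f"{hop} forwards to {nxt}")
```

In this sketch node X learns only that the next hop is Z, and node Y learns only the final destination, which is the property the caption describes. The innermost payload is left in the clear here for brevity; a real deployment would also protect it end to end.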

Table 1. TLA+ model verification results.

Metric	DVZKAuth	WebRTC Signaling
Configuration
Peers	2	2
Full Nodes	—	1
Swarms	—	1
Nonce Set Size	2	—
Message IDs	—	6
Time Bound (T2)	5	10
State Space Exploration
Total States Generated	21,925,567	5,463,130
Distinct States Found	4,282,470	858,435
Maximum Search Depth	22	14
Average Outdegree	1 (max: 9)	1 (max: 14)
Performance
Verification Time	5 m 42 s	1 m 35 s
States/Second	∼64,000	∼57,000
Verification Results
Invariant Violations	0	0
Deadlocks Detected	0	0
TypeOK Violations	0	0
Safety Property Violations	0	0
Verified Properties
TypeOK	✓	✓
FreshNonces	✓	—
SafetyInvariant	✓	—
T2 Timeout Constraint	✓	—
Session Expiry	✓	—
Anti-Glare (Invariant1)	—	✓
Identity Consistency (Inv2)	—	✓
Message Deduplication	—	✓
TTL Non-Negativity	—	✓
Fingerprint Collision Prob.	$2.7 \times 10^{-7}$	$1.7 \times 10^{-7}$
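
As a quick consistency check on Table 1, the reported throughput can be recomputed from the state counts and verification times. The sketch below assumes that the States/Second row is derived from total generated states divided by wall-clock time; the dictionary layout and the helper names `states_per_second` and `revisit_ratio` are illustrative and not part of the published tooling.

```python
# Figures copied from Table 1; times converted to seconds.
RUNS = {
    "DVZKAuth": {
        "generated": 21_925_567,
        "distinct": 4_282_470,
        "seconds": 5 * 60 + 42,
    },
    "WebRTC Signaling": {
        "generated": 5_463_130,
        "distinct": 858_435,
        "seconds": 1 * 60 + 35,
    },
}


def states_per_second(run: dict) -> float:
    """Generated states divided by verification time (assumed definition)."""
    return run["generated"] / run["seconds"]


def revisit_ratio(run: dict) -> float:
    """How many states were generated per distinct state found."""
    return run["generated"] / run["distinct"]


if __name__ == "__main__":
    for name, run in RUNS.items():
        print(
            f"{name}: {states_per_second(run):,.0f} states/s, "
            f"{revisit_ratio(run):.2f} generated per distinct state"
        )
```

Under that assumption, both recomputed rates fall within the same thousand as the approximate values reported in the table.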

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
