5.1. Security Assumptions
We adopt the Dolev–Yao adversarial model to formalize the threat environment under which the proposed protocol operates. In this model, the adversary is considered a probabilistic polynomial-time (PPT) entity with the following capabilities, grouped into three main categories relevant to cryptographic protocol analysis:
Network control capabilities: The adversary has full control over the communication channel between the Modbus client and the server. It can intercept, block, replay, reorder, delay, or modify any transmitted message. This grants the ability to mount man-in-the-middle (MITM) attacks and to tamper with messages.
Active protocol interaction: The adversary may attempt to impersonate a client or server without access to their long-term secrets. It can initiate concurrent sessions, interleave or reflect messages between sessions, and craft arbitrary protocol-compliant or malformed messages in an attempt to deceive the other party. However, it cannot forge authentication tags or decrypt ciphertexts without access to the appropriate keys.
Passive cryptographic analysis: The adversary can observe all public parameters, exchanged ciphertexts, nonces, and identifiers. Nevertheless, the adversary cannot break the underlying cryptographic primitives, such as HMAC or authenticated encryption with associated data (AEAD), which are assumed to be secure under standard assumptions. That is, cryptographic primitives behave ideally (e.g., HMAC is UF-CMA-secure, encryption schemes provide IND-CCA security), and no side-channel leakage occurs.
Despite these capabilities, the adversary is assumed to be computationally bounded and therefore cannot break cryptographic primitives. Specifically, the adversary cannot recover the pre-shared key or the derived session key unless it is explicitly leaked, for example through side-channel attacks, weak storage protections, or inadvertent transmission over unsecured channels. While our protocol assumes secure key provisioning prior to deployment, we acknowledge that, in practice, key leakage is a real threat. To address this, the scheme supports periodic rekeying and session key regeneration through fresh nonces during each session establishment. Additionally, the damage of long-term key exposure is limited by the use of ephemeral session keys, which are never reused and are not stored persistently.
It is also assumed that the adversary cannot forge valid MACs or authentication tags without the corresponding secret keys. The cryptographic algorithms used, namely ASCON-128 for encryption and ASCON-Hash for key derivation, are assumed to be secure, providing resistance against cryptanalytic attacks such as forgery, collision, and differential analysis. Notably, the Ascon family, which includes ASCON-128, was selected in the NIST Lightweight Cryptography standardization process, and has been shown to offer strong security with low computational overhead in constrained environments such as embedded systems and industrial control networks [27,28,29,30]. This makes it particularly suitable for Modbus-based IIoT applications. Additionally, all nonces used in the protocol, including the client and server nonces, are assumed to be generated using a cryptographically secure pseudorandom number generator and to be unpredictable to the adversary.
The security goals of the adversary include compromising the confidentiality of Modbus payloads, forging or modifying protocol messages without detection, impersonating legitimate clients or servers, replaying previously captured valid messages, and attempting to recover long-term or session-specific keys. Under these assumptions, we demonstrate in the subsequent analysis that our protocol achieves mutual authentication, confidentiality, integrity, and replay protection even in the presence of an active adversary with full network control.
5.2. Initialization Phase Security Analysis
Theorem 1. Mutual Authentication:
Under the assumption that HMAC is existentially unforgeable under chosen-message attacks (UF-CMA), and that the client and server nonces are chosen uniformly at random and never reused, the mutual authentication protocol ensures the following:
- (i) A Modbus server can verify that it is communicating with the legitimate Modbus client who knows the pre-shared key.
- (ii) A Modbus client can verify that it is communicating with the legitimate Modbus server who knows the pre-shared key.
Proof. We prove mutual authentication from both directions.
- (i) Server authenticates client: Given that the pre-shared key is known only to legitimate participants, and assuming the HMAC construction based on ASCON_MAC is secure against existential forgery under chosen-message attacks (UF-CMA), an adversary cannot forge a valid MAC on a fresh message. Additionally, since the client nonce is randomly generated per session and never reused, the server can reject any replayed or pre-computed request. Upon receiving a valid MAC, the server is assured that the request originated from a party who knows the pre-shared key, thus authenticating the client.
- (ii) Client authenticates server: Upon receiving the server's response, the client verifies its MAC using the pre-shared key. As before, if the pre-shared key is unknown to the adversary, then constructing a valid MAC is computationally infeasible under UF-CMA. Moreover, because the MAC binds to the original client-generated nonce, the client can confirm that the server has received and processed its request, thereby preventing reflection or de-synchronization attacks. The freshness of the nonces also ensures replay resistance.
□
Conclusion. Since both authentication steps rely on cryptographically secure HMACs over unique session nonces and secret key material, and these cannot be forged or reused by an adversary under the stated assumptions, the protocol guarantees mutual authentication between the client and server. Therefore, an adversary controlling the communication channel cannot impersonate either party or interfere undetectably in the handshake process.
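The two verification steps of the proof can be illustrated with a minimal sketch. It is not the authors' implementation: HMAC-SHA-256 from the Python standard library stands in for the ASCON-based MAC, and the message layout, the "req"/"resp" direction labels, the 16-byte nonce length, the server-side cache of seen nonces, and all function names are assumptions made for illustration only.

```python
import hashlib
import hmac
import secrets


def mac(key: bytes, *fields: bytes) -> bytes:
    """HMAC-SHA-256 over length-prefixed fields (stand-in for the ASCON-based MAC)."""
    h = hmac.new(key, digestmod=hashlib.sha256)
    for field in fields:
        # Length prefixes keep the encoding of several fields unambiguous.
        h.update(len(field).to_bytes(2, "big") + field)
    return h.digest()


def client_request(psk: bytes, client_id: bytes):
    """Client picks a fresh nonce and authenticates (client_id, nonce)."""
    n_c = secrets.token_bytes(16)
    return n_c, (client_id, n_c, mac(psk, b"req", client_id, n_c))


def server_respond(psk: bytes, request, seen_nonces: set):
    """Server checks freshness and the MAC, then answers with its own nonce."""
    client_id, n_c, tag = request
    if n_c in seen_nonces:
        return None  # replayed request
    if not hmac.compare_digest(tag, mac(psk, b"req", client_id, n_c)):
        return None  # sender does not know the pre-shared key
    seen_nonces.add(n_c)
    n_s = secrets.token_bytes(16)
    # The response MAC covers the client nonce, binding it to this request.
    return n_s, mac(psk, b"resp", n_c, n_s)


def client_verify(psk: bytes, n_c: bytes, response) -> bool:
    """Client accepts the server only if the MAC covers its own nonce."""
    n_s, tag = response
    return hmac.compare_digest(tag, mac(psk, b"resp", n_c, n_s))
```

The distinct "req" and "resp" labels are one simple way to keep a MAC from one direction from being reflected back as a valid message in the other direction; the protocol itself may achieve this differently.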
Formal Verification Result: We encoded the mutual authentication logic using ProVerif by defining the events client_authenticated and server_authenticated. As illustrated in Figure 4, all related security queries were successfully verified as true. These results provide formal assurance that the protocol achieves mutual authentication, even in the presence of a Dolev–Yao adversary.
Theorem 2. Session Key Secrecy:
Assuming the hash function used in the session key derivation function is modeled as a random oracle, and that the pre-shared key is kept secret, the session key SK is indistinguishable from a random value to any probabilistic polynomial-time (PPT) adversary who does not possess the pre-shared key, even with full control over the network.
Proof. We prove that the session key established between the Modbus client and server remains secret from any adversary under the standard Dolev–Yao threat model, augmented with the assumption that the hash function behaves as a random oracle.
The session key SK is derived from the pre-shared key and two freshly generated, uniformly random nonces, one chosen by the client and one by the server. These nonces are included in the plaintext authentication messages exchanged during the initialization phase.
An adversary who does not know the pre-shared key cannot forge valid MACs or predict future nonces, and thus cannot insert itself into the key derivation process. Moreover, since the nonces are never reused, each session produces a unique key, even if the same pre-shared key is reused.
The key derivation function is modeled in the random oracle setting, meaning that unless the pre-shared key is known, the input to the hash function is computationally hidden from the adversary. Consequently, the adversary cannot distinguish the resulting SK from a uniformly random 128-bit string.
Suppose, for contradiction, that an adversary can distinguish SK from a random string with non-negligible advantage. Then, the adversary must have queried the random oracle on the correct tuple of pre-shared key and nonces, which requires knowledge of the pre-shared key. However, without forging a valid MAC (which is infeasible under the UF-CMA assumption of HMAC), the adversary cannot learn whether a guessed key is correct. Thus, we reach a contradiction. □
Conclusion: Under the assumptions that the hash function used in session key derivation is modeled as a random oracle, the HMAC construction is existentially unforgeable under chosen-message attacks (UF-CMA), the client and server nonces are freshly generated and never reused, and the pre-shared key remains secret and known only to the legitimate client and server, we conclude that no probabilistic polynomial-time (PPT) adversary can compute the derived session key SK or distinguish it from a random value. Consequently, the session key secrecy property is preserved against active and passive adversaries under the Dolev–Yao model.
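The derivation analyzed above can be sketched as a hash over the pre-shared key and both nonces, truncated to 128 bits. This is only an illustration under stated assumptions: SHA-256 stands in for ASCON-Hash, and the field order, the length-prefixed encoding, the "session-key" label, and the function name are hypothetical rather than taken from the protocol specification.

```python
import hashlib


def derive_session_key(psk: bytes, n_c: bytes, n_s: bytes) -> bytes:
    """Derive a 128-bit session key from the pre-shared key and both nonces.

    SHA-256 is a stand-in for ASCON-Hash; the input encoding is assumed.
    """
    h = hashlib.sha256()
    for field in (b"session-key", psk, n_c, n_s):
        # Length prefixes prevent two different (psk, n_c, n_s) tuples
        # from producing the same concatenated input.
        h.update(len(field).to_bytes(2, "big") + field)
    return h.digest()[:16]  # truncate to 128 bits
```

Because both nonces enter the hash, changing either one yields an unrelated key even when the same pre-shared key is reused, which is the per-session uniqueness the proof relies on.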
Formal Verification Result: We modeled the secrecy of the session key SK in ProVerif with the declaration query attacker(SK). ProVerif returned false, indicating that the adversary cannot derive or guess the session key. This formally confirms that the session key secrecy property is preserved under the Dolev–Yao adversary model, as illustrated in Figure 5.
5.3. Secure Communication Phase Security Analysis
Once the session key has been securely established, the Modbus client begins transmitting encrypted application-layer messages using the authenticated encryption algorithm ASCON-128, encapsulated within a COSE_Encrypt0 structure. In this section, we analyze the confidentiality, integrity, and authenticity guarantees of this message transmission stage.
Theorem 3. Message Confidentiality:
Under the assumption that the ASCON-128 authenticated encryption scheme is IND-CCA-secure, and the session key SK remains secret, the secure Modbus transmission phase guarantees the confidentiality of the Modbus payload against any PPT adversary.
Proof. We assume a Dolev–Yao adversary who has full control over the communication channel: they can intercept, replay, modify, inject, or block any message exchanged between the client and server. However, the adversary does not know the session key SK, which is derived securely during the initialization phase as proven previously.
In the secure transmission phase, the Modbus client encrypts the plaintext payload M using the ASCON-128 authenticated encryption algorithm. The encryption input includes the plaintext message M, the session key SK, and optional associated data such as the protected header and unprotected header. The ASCON scheme outputs the encrypted message and an authentication tag, which are combined and encoded into the ciphertext field of the COSE_Encrypt0 structure.
Given that ASCON-128 is IND-CCA-secure, any adversary who does not possess the correct session key cannot distinguish between the encryption of two chosen messages nor decrypt any message encrypted under that key. Moreover, attempts to tamper with the ciphertext will be detected due to the built-in authentication tag, and the message will be rejected upon decryption. □
Conclusion: Under the assumptions that ASCON-128 provides IND-CCA security, and that the session key SK is known only to the legitimate Modbus client and server, it follows that the ciphertext exchanged during the secure transmission phase leaks no information about the underlying Modbus payload to any PPT adversary. Therefore, the confidentiality of the Modbus message is preserved.
Formal Verification Result: ProVerif confirmed the correspondence between the encrypted message output by the client and the ciphertext accepted by the server. Specifically, the query returned true, indicating that only ciphertexts generated by an authenticated sender are successfully verified and accepted by the receiver. This formally confirms the integrity and authenticity of message delivery under an active adversary, as shown in Figure 6.
Theorem 4. Message Integrity and Authenticity:
Under the assumption that ASCON-128 is an authenticated encryption scheme with integrity of ciphertexts (INT-CTXT), and the session key SK is known only to the legitimate participants, any modification of the ciphertext or forgery of a valid message by an adversary will be detected and rejected with overwhelming probability.
Proof. We consider a Dolev–Yao adversary who has full access to the communication channel and can intercept, inject, replay, or modify messages. However, the adversary does not know the session key SK, which was securely established during the initialization phase and proven to remain secret.
In the secure communication phase, the Modbus client encrypts each payload using the ASCON-128 authenticated encryption scheme. Upon receiving a message, the Modbus server uses the shared session key SK and the corresponding nonce to attempt decryption. The decryption function verifies the tag internally. If any part of the ciphertext or associated data has been altered, or if the ciphertext was not generated by someone possessing SK, the decryption will fail and the message will be rejected.
The INT-CTXT property of ASCON-128 ensures that it is computationally infeasible for any PPT adversary to generate a valid ciphertext–tag pair without knowing SK. This directly provides strong guarantees for message integrity. In addition, since only the legitimate sender can compute a valid tag over the message and associated context, successful verification at the receiver implies that the message originated from the party who shares SK, thereby establishing authenticity.
Moreover, because the tag computation includes associated data (such as protected headers), even tampering with protocol metadata or session context will result in decryption failure, ensuring contextual integrity as well. □
Conclusion: Given that ASCON-128 is INT-CTXT-secure, and that the session key SK is known only to the legitimate parties, we conclude that the integrity and authenticity of the transmitted messages are guaranteed. Any attempt by an adversary to modify or forge messages will result in authentication failure and message rejection.
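The reject-on-tamper behavior described in the proof can be demonstrated with a small stand-in. The sketch below is explicitly not ASCON-128 and not suitable for production use: it is a generic encrypt-then-MAC construction built from HMAC-SHA-256 in the Python standard library, chosen only so the example runs without third-party packages. The names seal and unseal, the 16-byte tag length, and the key-separation labels are assumptions.

```python
import hashlib
import hmac

TAG_LEN = 16  # assumed tag length in bytes


def _prf(key: bytes, *fields: bytes) -> bytes:
    """HMAC-SHA-256 over length-prefixed fields."""
    h = hmac.new(key, digestmod=hashlib.sha256)
    for field in fields:
        h.update(len(field).to_bytes(4, "big") + field)
    return h.digest()


def _keystream(enc_key: bytes, nonce: bytes, length: int) -> bytes:
    """Counter-mode keystream derived from the PRF (toy cipher, illustration only)."""
    out, counter = b"", 0
    while len(out) < length:
        out += _prf(enc_key, nonce, counter.to_bytes(4, "big"))
        counter += 1
    return out[:length]


def seal(sk: bytes, nonce: bytes, aad: bytes, plaintext: bytes) -> bytes:
    """Encrypt, then authenticate (aad, nonce, ciphertext); returns ciphertext || tag."""
    enc_key, mac_key = _prf(sk, b"enc"), _prf(sk, b"mac")
    stream = _keystream(enc_key, nonce, len(plaintext))
    ct = bytes(p ^ k for p, k in zip(plaintext, stream))
    return ct + _prf(mac_key, aad, nonce, ct)[:TAG_LEN]


def unseal(sk: bytes, nonce: bytes, aad: bytes, sealed: bytes):
    """Verify the tag first; return the plaintext, or None if verification fails."""
    if len(sealed) < TAG_LEN:
        return None
    ct, tag = sealed[:-TAG_LEN], sealed[-TAG_LEN:]
    enc_key, mac_key = _prf(sk, b"enc"), _prf(sk, b"mac")
    if not hmac.compare_digest(tag, _prf(mac_key, aad, nonce, ct)[:TAG_LEN]):
        return None  # altered ciphertext, altered associated data, or wrong key
    stream = _keystream(enc_key, nonce, len(ct))
    return bytes(c ^ k for c, k in zip(ct, stream))
```

Flipping a single ciphertext bit, changing the associated data, or decrypting under a different key all lead unseal to return None, mirroring the rejection cases listed in the proof.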
Formal Verification Result: The same correspondence assertion between message sent and message accepted also ensures message authenticity and integrity. ProVerif confirmed that no forged ciphertext can be accepted unless it was sent by an authenticated party.
Theorem 5. Replay Protection:
Under the assumption that iv and sequence numbers are never reused within the same session and are strictly monotonic across messages for each sender identity, the protocol guarantees replay protection in the secure transmission phase.
Proof. In each session, the Modbus client includes a unique iv and a strictly increasing sequence number in the unprotected header of each COSE_Encrypt0 message. The issuer identifier specifies the identity of the message originator, allowing the server to isolate and track sequence numbers on a per-client basis.
On the server side, for each known issuer identifier, a per-client state is maintained, recording the latest accepted sequence number or a sliding replay window. Upon receiving a new message, the server performs the following checks: (i) whether the sequence number received for the given issuer identifier has already been processed, or (ii) whether it is smaller than the expected next sequence number. If either case holds, the message is rejected as a replay.
Furthermore, because the iv, sequence number, and issuer identifier are bound to the authenticated ciphertext via the authenticated encryption scheme ASCON-128 (typically as associated data), any alteration or reuse results in authentication tag failure. This ensures that replayed ciphertext, even if bit-wise-identical, is detected based on prior sequence use or contextual mismatch during tag verification. An adversary operating under the Dolev–Yao model, capable of intercepting and resending prior messages, cannot bypass these defenses without also forging a valid tag under a new context. Given the security of ASCON-128 under the INT-CTXT and nonce-respecting assumptions, such forgeries are computationally infeasible. □
Conclusion: By embedding unique nonces and strictly increasing sequence numbers tied to the sender identity, and by enforcing per-sender replay validation and authenticated decryption, the protocol achieves strong replay protection. This guarantee holds against any PPT adversary in the Dolev–Yao model, provided that nonces and sequence numbers are used correctly and without reuse.
Our current design assumes strictly increasing sequence numbers to protect against replay attacks. However, in practical environments, devices may experience unexpected reboots or desynchronization due to power loss, and networks may exhibit delays or packet reordering. To address these issues, sequence number windows (e.g., sliding window protocols) can be employed to tolerate minor reordering, and devices can securely persist their counters across reboots using non-volatile memory or secure elements. Alternatively, incorporating lightweight challenge–response mechanisms or timestamps can enhance robustness against such threats in unstable network conditions.
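The per-issuer replay check, including the sliding-window variant mentioned above, can be sketched as follows. The class and method names, the bitmap representation, and the default window size of 32 are assumptions for illustration. The sketch also assumes that accept is called only after the authentication tag of the message has been verified.

```python
from dataclasses import dataclass


@dataclass
class ReplayState:
    highest: int = -1  # highest sequence number accepted so far
    bitmap: int = 0    # bit i set => sequence number (highest - i) was accepted


class ReplayGuard:
    """Tracks accepted sequence numbers separately for each issuer identifier."""

    def __init__(self, window: int = 32):
        self.window = window
        self.states = {}

    def accept(self, issuer: str, seq: int) -> bool:
        """Return True if (issuer, seq) is fresh; call only after tag verification."""
        if seq < 0:
            return False
        st = self.states.setdefault(issuer, ReplayState())
        if seq > st.highest:
            # Newer than anything seen: slide the window forward.
            shift = seq - st.highest
            st.bitmap = ((st.bitmap << shift) | 1) & ((1 << self.window) - 1)
            st.highest = seq
            return True
        offset = st.highest - seq
        if offset >= self.window:
            return False  # too old to be tracked, treated as a replay
        if st.bitmap & (1 << offset):
            return False  # already processed
        st.bitmap |= 1 << offset  # late but unseen message inside the window
        return True
```

With window=1 the guard reduces to the strictly increasing rule assumed in the theorem; a larger window tolerates limited reordering at the cost of a small amount of extra state per client.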