ESP32-Based Hardware Key for Software Application Protection

Popovici, Alexandru-Ion; Anton, Florin-Daniel

doi:10.3390/app16094251


by Alexandru-Ion Popovici and Florin-Daniel Anton *

Department of Automatic Control and Industrial Informatics, Faculty of Automatic Control and Computers, National University of Science and Technology POLITEHNICA Bucharest, RO060042 Bucharest, Romania

* Author to whom correspondence should be addressed.

Appl. Sci. 2026, 16(9), 4251; https://doi.org/10.3390/app16094251

Submission received: 24 March 2026 / Revised: 23 April 2026 / Accepted: 24 April 2026 / Published: 27 April 2026

(This article belongs to the Topic Addressing Security Issues Related to Modern Software)


Featured Application

Low-cost hardware key designed to protect licensed software applications in scenarios exposed to reverse engineering, emulation, and replay attacks by moving sensitive decisions and cryptographic operations to an embedded dongle.

Abstract

Classic software licensing and protection mechanisms based exclusively on host application checks can be circumvented by patching, emulation and replay attacks in user-controlled environments. This paper presents an adaptive hardware key implemented on the ESP32-S3 platform, which externalizes sensitive decisions and cryptographic operations from the host application to a dedicated device. The solution combines a device-anchored root of trust (secure boot and flash memory encryption), a PKI-verifiable identity (Public Key Infrastructure X.509 certificate and digital signatures as proof of ownership), hierarchical key derivation to avoid static secrets and the establishment of an authenticated encrypted session for all essential data exchanges. User access is conditioned by three-factor authentication (PIN, Personal Identification Number; TOTP, Time-based One-Time Password; and USB physical presence) and a “code-in-dongle” mechanism, in which the important logic runs on the device and the application receives tokens with limited duration. Experimental validation demonstrates correct provisioning, secure session establishment, and the expected rejection behavior under negative brute-force testing, as well as lifecycle support via signed OTA (Over-The-Air) updates with anti-rollback and encrypted backup/recovery. Build reports indicate a balanced flash distribution and available DIRAM (Data/Instruction RAM) margin, while IRAM (Instruction RAM) saturation (99.99%) reflects a normal architectural behavior of the ESP32-S3 unified memory model rather than a capacity constraint.

Keywords:

hardware dongle; software protection; license enforcement; ESP32-S3; secure boot; flash encryption; public key infrastructure; multi-factor authentication; AES-GCM; anomaly detection

1. Introduction

In recent years, the way software applications have been distributed and economically exploited has taken shape around two major directions: the expansion of licensing models (subscriptions, per-device/per-user licenses) and the migration toward cloud and IoT ecosystems. In parallel, in a digital landscape increasingly exposed to cybersecurity risks, protecting software intellectual property is becoming a priority, especially in industries that rely on licensed applications, industrial solutions, or proprietary algorithms. In this area, software piracy, license cloning, and physical attacks on hardware protection devices continue to generate significant economic losses worldwide. In response to these challenges, a frequently used category of solutions is represented by hardware dongles (physical keys), which implement authentication mechanisms and/or cryptographic key storage so that the application runs only in the presence of the device.

However, current technological developments (virtualization, containers, distributed execution, multi-factor authentication, and the adoption of zero-trust architectures) make classic dongle-based solutions increasingly difficult to integrate and manage in modern environments [1,2]. In addition, users seek the simplest possible experience, while developers look for scalable, easy-to-maintain mechanisms, including from an operational perspective.

Specialized literature highlights an important differentiation between the integration levels of these devices: in some implementations, the dongle is used in a limited manner, primarily as a physical presence verification element (a possession token), without extended cryptographic processing capabilities, advanced authentication, or execution-protection mechanisms, while other works treat it as an active component of the security protocol and the application protection architecture.

Following a thorough analysis of the concepts addressed in the research studies on ensuring the security of information transmission and information services, the adoption of mechanisms for combining symmetric key cryptosystems to protect data files with public key cryptosystems for the safe transmission of session keys is noted. This ensures a well-defined separation of roles between the hardware key that handles the management of session keys and user permissions and the computer that handles data processing [3,4].

A second important direction targets authentication and integrity, through mutual software–device authentication, the use of per-session random numbers, dynamic passwords, and integrity codes (MAC), together with communication encryption; these approaches report good resistance to replay, forgery, and data manipulation, with minimal performance impact [5,6,7,8]. Extensions of these mechanisms move the dongle’s role beyond application-level validation toward permission verification already at system startup, or even before the operating system is loaded, including solutions with biometric authentication and BIOS-level verification, where results indicate increased security by shifting access control to an early stage of the boot process [9].

Another relevant direction in the current state of the art concerns application protection during execution and resistance to analysis and reverse engineering. In this area, solutions have been proposed that combine static protections (compression, encryption, code obfuscation) with dynamic protections (runtime monitoring, timestamps, anti-debugging processes), integrated with bidirectional authentication between the software and the physical key. Reported results show that these mechanisms provide a high level of security, but at a performance cost that is most noticeable as decryption time grows with file size [9]. At the same time, experimental studies show that simply checking for the presence of the dongle is not enough to ensure solid protection. Implementations that rely only on calls to the dongle API can be bypassed by identifying and removing these calls, and techniques such as fading or wrapping the software only make it harder to analyze, without completely eliminating the risk of compromise [10,11]. These observations highlight the difference between using the dongle as a simple presence device and using it as an active cryptographic element, directly integrated into the protocol, authentication, and execution protection.

Another interesting approach is presented in [12,13,14] where the use of advanced nonlinear dynamical systems, including 3D cubic maps with dual memristors, memristive Rulkov neurons and non-degenerate 3D hyperchaotic systems, is promoted for securing multimedia data. Studies demonstrate that these mathematical models generate high-entropy sequences, providing efficient and robust algorithms for encrypting images and video segments.

The literature also presents extensions to secure group key distribution, secure data access in the cloud and industrial HMI/SCADA applications. In group communications, combining USB security with Shamir Secret Sharing and Lagrange interpolation has resulted in better protection of session keys and better resistance to internal and external attacks [15]. In cloud environments, the integration of digital signatures, PINs, and lockout mechanisms after failed attempts has proven effective for authentication and flexible access management [16]. In industrial environments, solutions based on hardware keys, two-factor authentication, and automated credential management have demonstrated greater resistance to attack attempts and a reduction in human error [17].

These results point to the need for a hardware protection solution that combines the advantages of a root of trust with the requirements of modern management, including secure updates, recovery mechanisms and adaptive security policies [18]. In practice, the transition to distributed services and infrastructures also changes where trust is granted. It is no longer enough for an application to check a simple licensing condition locally, as the environment can be copied, analyzed or virtualized. In the Zero Trust paradigm, no implicit trust is assumed in the user, the device, or the network; instead, access is conditioned by recurring verification and explicit policies [19]. In addition, multi-factor authentication has become a common mechanism for reducing the risk associated with static passwords, by combining multiple types of factors (knowledge, possession, and, where applicable, temporal factors).

In software protection projects, the possession factor can be represented by a physical device, but it must be integrated into a coherent flow that includes both user authentication and device validation and integrity verification, so that the solution remains reliable even against modern attacks and cloning or emulation scenarios. An additional limitation arises when the protocol between the application and the dongle is predictable enough to be reproduced. Even if a challenge–response protocol uses cryptography, unclear definition of the signed data, lack of strong device identity proofs, absence of anti-replay protections, and failure to bind operations to an authenticated session can enable emulation, replay, or instrumentation in virtualized environments. In practice, virtualization and remote execution make dongle integration more difficult, and some commercial solutions explicitly address these risks [20].

In addition, the lifecycle of a protection solution does not end at the moment of installation. Over the long term, operational requirements emerge such as firmware updates, prevention of downgrades to vulnerable versions, and the ability to recover after errors or compromise attempts. Guidelines such as NIST SP 800-193 [18] discuss principles of firmware resilience and the idea that a system should be able to prevent corruption, detect corruption, and recover in a controlled manner. In a hardware key, these requirements translate into mechanisms such as secure boot, protection of flash contents, rollback policies, and explicit recovery procedures.

In this context, the proposed project aims to develop a hardware key based on the ESP32-S3 GEEK that can support the protection of a software application through a complete flow covering both device authentication and user authentication, as well as maintaining a secure session. At a functional level, the system must be able to establish an encrypted and authenticated communication channel (for example, using an AEAD mode such as AES-GCM), enforce multi-factor user authentication, and provide the application with a verifiable result (token) to unlock protected functions.

From a cryptographic and identity-management perspective, the goal is for the device to have its own verifiable identity, modeled through a PKI infrastructure based on X.509 certificates and revocation via CRLs, so that a compromised or withdrawn dongle can be invalidated [21]. In addition, the project aims to eliminate reliance on a single static key by using hierarchical key derivation (HKDF/HMAC) to separate keys by purpose and reduce the impact of potential exposure [22]. User authentication is based on a knowledge factor (PIN), a temporal factor (TOTP), and a possession factor (the physical presence of the device via USB), in line with the principles of the TOTP standard [23].

2. Materials and Methods

2.1. Overview and Threat Model

The proposed solution is built as a distributed system in which the access decision for the application’s protected functions is not left solely to the software running on the PC, but is instead enforced by a dedicated hardware device (dongle) based on the ESP32-S3. The architecture aims to separate responsibilities as clearly as possible: the PC application manages the user interface and consumes the security results (a token), while the dongle manages the cryptographic identity, key derivation, session-channel establishment, and the execution of sensitive logic that should not be exposed within the application. In this way, even if the application were modified by an attacker, it would not be able to generate a valid result without carrying out a cryptographic dialogue with the device. Put differently, the architecture moves verification away from a local, easy-to-bypass check toward a mechanism in which the application is repeatedly forced to request proofs and responses from an external element designed to resist cloning and emulation. Moreover, because the system negotiates a session, results are not treated as fixed values but as values anchored to a temporal and session context (nonces, counters, expiration), which significantly reduces the effectiveness of replay attacks.

Conceptually, the architecture combines three main ideas that complement each other. The first is the existence of a Root of Trust on the device, supported by secure boot mechanisms and flash-content protection, so that the firmware running on the dongle is authentic and cannot be read or modified. In the case of the ESP32-S3, Secure Boot v2 performs cryptographic verification of the bootloader and application using RSA-PSS [24], and Flash Encryption encrypts the contents of flash memory so that physically reading the flash is not sufficient to recover code and data [25]. Through this combination, the firmware becomes a credible component: it boots only if it is authentic (integrity + authenticity) and preserves confidentiality under realistic storage-attack conditions.

The second idea is to build a device identity based on public key infrastructure, implemented through a digital certificate. In this way, the application can distinguish an authentic dongle from an emulated one and can revoke any device that has been compromised, using the X.509 certificate profile and the certificate revocation lists (CRLs) defined in RFC 5280. In practice, the device holds its own private key and, during the handshake, generates a signature that proves possession of this key. The host then verifies the signature using the public key in the certificate and validates the certificate against a trusted certification authority. The certificate can also be renewed without generating a new key pair, so the private key remains on the dongle and is not replaced.

The third idea is to move from simply validating the presence of the hardware device to a stronger access mechanism: three-factor user authentication (PIN, one-time password generator and physical presence) combined with adaptive policies based on the user’s behavior. To protect the confidentiality and integrity of the communication, the entire flow is carried out over a session channel encrypted with AES-GCM, as recommended by NIST SP 800-38D [21].

Figure 1 shows the general architecture of the system divided into four areas: the user’s computer running the application, the ESP32 S3-GEEK dongle running the AHSK (Adaptive Hardware Security Key) firmware, the public key infrastructure area and the device management area.

2.2. Hardware Device Used in the Implementation and Justification for the Selection

In this work, the Waveshare ESP32-S3-GEEK development board (Figure 2) was used because it provides, in a compact form factor, sufficient resources to implement a hardware key with cryptographic functions and device-level security mechanisms. It is built around the ESP32-S3 microcontroller, a modern SoC suitable for applications that require both processing performance and support for security features and communication with a host system.

The ESP32-S3-GEEK includes external memory (Flash and PSRAM), which allows the authentication logic, communication protocol, and logging mechanisms to run without major resource constraints, while the integrated USB interface facilitates using the device as a dongle directly connected to a PC. From a security perspective, the ESP32-S3 family is well suited for a dongle because it supports Root of Trust mechanisms through Secure Boot v2, which verifies the bootloader and application using a signature scheme (RSA-PSS), reducing the risk of running modified or unsigned firmware. The board can be programmed with Arduino, with MicroPython v1.28, or with the ESP-IDF development framework.

Developing with Arduino or MicroPython is flexible and easy, but it limits the level of control over the device and prevents it from being exploited to its full capacity. More generally, the platform allows a high degree of customization, since the developer has full control over the code on both the hardware side (firmware and security-mechanism configuration) and the software side (the host application and the communication protocol), unlike closed commercial solutions where features and security policies are typically predefined and harder to adapt.

The firmware is developed in C/C++, with maximum integration of the platform’s security mechanisms as well as the cryptographic libraries available in the ESP-IDF ecosystem (mbedTLS and related components). In the following sections, the implementation is presented step by step to clearly highlight the link between the platform’s security mechanisms and the system’s operational flow.

2.3. System Implementation

2.3.1. Stage 1—Root of Trust, PKI Identity, and Key Ladder

The system’s operation starts from the idea that the dongle must be a trusted element, capable of running only authorized firmware and keeping secrets in a way that withstands attempts at cloning or extraction. In the implementation (Figure 3), the ESP32-S3 device is configured so that its boot process is protected by the platform’s hardware mechanisms. At each startup, the firmware verification process runs to ensure that a previously modified version cannot be executed without being detected. Subsequently, if the firmware is validated, the process continues with the encryption of the flash memory, which means that if someone were to try to read the physical contents of the memory directly, the data obtained could neither be interpreted nor reused.

When the dongle is used for the first time, it enters a controlled provisioning mode in which it creates its cryptographic identity. The device already has a unique hardware fingerprint, represented by the MAC (Media Access Control) address stored in eFuse. This information is used as a baseline identifier and enables unique association of the device within the system. During provisioning, the dongle internally generates an ECC (Elliptic Curve Cryptography) key pair on the secp256r1 curve [26,27], using the hardware random-number generator. The resulting private key becomes the device’s identity key and is kept in internal memory in an area protected by flash encryption; the public key is used to issue a digital certificate.

To prove the device’s authenticity to the PC application, the dongle builds a CSR (Certificate Signing Request) containing the public key and identification metadata, such as a serial number derived from the MAC. The CSR is transferred to the manufacturer’s provisioning environment, where it is signed by a certification authority controlled by the manufacturer. As a result, the device’s X.509 certificate, called “Cert_dev”, is obtained and installed on the dongle. The certificate is not a secret and can later be transmitted to the host whenever needed, but the associated private key remains only on the dongle and never leaves the device.

After establishing the identity, the system addresses the essential aspect of key management, namely avoiding a static key stored in memory. The implementation is based on a key derivation mechanism that starts from a trusted key stored in eFuse, which cannot be extracted and is accessed only through the HMAC (Keyed-Hash Message Authentication Code) function. The derivation then incorporates the validated firmware and the anti-rollback counter stored in eFuse, so that the protection of the keys also depends on the state of the firmware. Thus, if the firmware is maliciously modified, the derived keys are no longer valid and any attempt to perform sensitive operations fails.

At the same time, the system separates keys by purpose, so that the same key is never used for all operations. Equally important, each new session generates temporary session keys from fresh random values created at that moment. In this way, even if a session is compromised, its keys cannot be reused after the device restarts or after they expire.
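The key ladder described above can be illustrated with a minimal HKDF (RFC 5869) sketch built on HMAC-SHA-256. This is only an illustration under stated assumptions: on the device the root key stays in eFuse and is reachable only through the HMAC function, whereas here it is modeled as a plain byte string; the function names, the purpose labels, and the way the firmware digest and anti-rollback counter are mixed into the salt are hypothetical choices, not the firmware's actual layout.

```python
import hashlib
import hmac


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """HKDF-Extract (RFC 5869): condense input keying material into a PRK."""
    return hmac.new(salt, ikm, hashlib.sha256).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF-Expand (RFC 5869): produce `length` bytes bound to the `info` label."""
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_purpose_key(root_key: bytes, firmware_digest: bytes,
                       rollback_counter: int, purpose: bytes) -> bytes:
    """Derive a per-purpose key that also depends on the firmware state.

    Assumption: firmware digest and anti-rollback counter are used as the salt,
    so a modified firmware or a changed counter yields different keys.
    """
    salt = firmware_digest + rollback_counter.to_bytes(4, "big")
    return hkdf_expand(hkdf_extract(salt, root_key), purpose)


def derive_session_key(channel_key: bytes, host_nonce: bytes,
                       device_nonce: bytes) -> bytes:
    """Derive a short-lived 128-bit session key from fresh nonces of both sides."""
    prk = hkdf_extract(host_nonce + device_nonce, channel_key)
    return hkdf_expand(prk, b"session", 16)
```

Separating keys through the `purpose` label means that exposing, for example, a hypothetical `b"totp"` key reveals nothing about a `b"channel"` key derived from the same root.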

2.3.2. Stage 2—Three-Factor Authentication (3FA)

After the identification stage of the device is completed, the system must also verify who the user is. This is the role of the three-factor authentication stage (Figure 4): PIN verification, TOTP code validation, and physical presence of the device through the Universal Serial Bus (USB) interface.

The first factor is the PIN, known only to the user. During initial configuration, the PIN is processed by a key derivation function (KDF) together with a securely stored salt. Subsequently, when the user wishes to authenticate, they enter the PIN, which is processed in the same way, and the result is compared with the previously stored value, which reduces exposure to side-channel attacks. If the entered PIN is incorrect, a failure counter is incremented and a progressive delay is applied to slow down repeated attempts. The higher the number of failures, the higher the risk score, which can move the system from the NORMAL state to the SUSPECT or LOCKDOWN state. If the entered PIN is valid, the system moves on to the second authentication factor, TOTP.
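The PIN flow can be sketched as follows. The text does not name the KDF, the comparison method, or the delay schedule, so PBKDF2-HMAC-SHA-256, a constant-time comparison, and an exponential backoff with a cap are all assumptions made for illustration; the constants are hypothetical.

```python
import hashlib
import hmac
import os

PBKDF2_ITERATIONS = 100_000  # assumed work factor, not taken from the paper


def enroll_pin(pin: str) -> tuple[bytes, bytes]:
    """Return (salt, derived value) to be stored at initial configuration."""
    salt = os.urandom(16)
    derived = hashlib.pbkdf2_hmac("sha256", pin.encode(), salt, PBKDF2_ITERATIONS)
    return salt, derived


def verify_pin(pin: str, salt: bytes, stored: bytes) -> bool:
    """Re-derive the value from the entered PIN and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", pin.encode(), salt, PBKDF2_ITERATIONS)
    return hmac.compare_digest(candidate, stored)


def backoff_seconds(failures: int, base: float = 1.0, cap: float = 300.0) -> float:
    """Progressive delay: doubles with each consecutive failure, up to `cap`."""
    if failures <= 0:
        return 0.0
    return min(cap, base * 2 ** (failures - 1))
```

Only the salt and the derived value are stored, never the PIN itself; the failure count that feeds `backoff_seconds` is the same signal that the risk engine consumes.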

The TOTP code is calculated and verified based on a dedicated key and the current time, provided by the device’s internal RTC (Real-Time Clock) or by NTP (Network Time Protocol) synchronization when available.
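For reference, TOTP verification as standardized in RFC 6238 (built on the HOTP truncation of RFC 4226) can be sketched in a few lines. The 30-second step, six digits, SHA-1, and a tolerance window of one step are the common authenticator defaults and are assumed here; the paper does not state which parameters the firmware uses.

```python
import hashlib
import hmac
import struct


def hotp(key: bytes, counter: int, digits: int = 6) -> str:
    """HOTP (RFC 4226): HMAC-SHA-1 over the counter, then dynamic truncation."""
    mac = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(key: bytes, unix_time: float, step: int = 30, digits: int = 6) -> str:
    """TOTP (RFC 6238): HOTP with the counter set to the current time step."""
    return hotp(key, int(unix_time) // step, digits)


def verify_totp(key: bytes, code: str, unix_time: float,
                step: int = 30, window: int = 1, digits: int = 6) -> bool:
    """Accept codes from the current step and `window` steps on either side."""
    current = int(unix_time) // step
    for delta in range(-window, window + 1):
        counter = current + delta
        if counter >= 0 and hmac.compare_digest(hotp(key, counter, digits), code):
            return True
    return False
```

With the RFC 6238 test key `b"12345678901234567890"` and time 59, `totp(key, 59, digits=8)` returns `"94287082"`, matching the published test vector. The tolerance window is what absorbs small RTC drift between NTP synchronizations.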

The third factor is the physical presence of the ESP32 S3-GEEK device via the physical USB connection, which represents the element of possession. The dongle confirms that it is actually connected to the computer and that the data exchange takes place over the intended channel, which reduces the chances of a simulated remote communication being accepted.

2.3.3. Stage 3—Code-in-Dongle and Token Acquisition

The third stage illustrated in Figure 5 is the core of the proposed architecture because it moves the essential execution and decision elements into the dongle and thus eliminates attempts to bypass or modify the application’s functionalities. Unlike the solutions addressed in research works in the field of information protection, where the application decides locally whether it can be started or not, in the present work the application is rather a client that starts following the decision made by the ESP32 S3-GEEK device.

The process begins by initiating a handshake between the host and the device. If this process is validated, it continues with a proof-of-possession verification in which the dongle cryptographically demonstrates possession of the private key associated with its certificate. If the signature generated by the dongle is invalid, continuation of the process is refused because there is a suspicion of possible emulation. Instead, if the signature is valid, the secure channel is created between the dongle and the application, based on the session key derived from the root key.

Before any sensitive operation, the hardware device verifies the fulfillment of three-factor authentication and applies internal security and risk control policies. Only if all these conditions are met is the internal secure logic executed and the dongle responds with either an “OK” token accompanied by a lifetime (time-to-live), or with a “NO” response if the policy requires it or if authentication is not accepted. The application uses the token as proof of access and, if the token is valid and within the correct time window, unlocks functions or decrypts protected resources. In this way, even if an attacker attempts to modify the application to “skip” verification, they cannot forge the correct token because it is derived from keys that never leave the dongle and from a session context that continuously changes.
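One way to picture a token that is bound to the session context and to a limited lifetime is an HMAC tag over the session nonce, a monotonic counter, and an expiry time. The paper does not specify the token format, so the field layout, the label, and the function names below are hypothetical; the sketch only shows why a captured token cannot be replayed or reused in another session.

```python
import hashlib
import hmac
import struct
from typing import Optional

TAG_LEN = 32              # HMAC-SHA-256 output size
LABEL = b"unlock-token"   # hypothetical domain-separation label


def issue_token(session_key: bytes, session_nonce: bytes,
                counter: int, now: int, ttl: int) -> bytes:
    """Dongle side: bind nonce, counter, and expiry under the session key."""
    body = session_nonce + struct.pack(">QQ", counter, now + ttl)
    tag = hmac.new(session_key, LABEL + body, hashlib.sha256).digest()
    return body + tag


def check_token(session_key: bytes, session_nonce: bytes,
                last_counter: int, now: int, token: bytes) -> Optional[int]:
    """Host side: return the accepted counter, or None if the token is rejected."""
    body, tag = token[:-TAG_LEN], token[-TAG_LEN:]
    if len(body) != len(session_nonce) + 16:
        return None
    expected = hmac.new(session_key, LABEL + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None  # forged or corrupted
    counter, expires = struct.unpack(">QQ", body[-16:])
    if body[:-16] != session_nonce:
        return None  # issued for a different session
    if counter <= last_counter or now >= expires:
        return None  # replayed or expired
    return counter
```

The host stores the returned counter and passes it as `last_counter` on the next check, so presenting the same token twice fails even inside its time-to-live.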

2.3.4. Stage 4—Anomaly Detection

To avoid situations where the system reacts only to success/failure without noticing repeated suspicious behavior, the system includes a risk-scoring engine and a set of adaptive policies, as shown in Figure 6. The dongle continuously collects relevant operational signals such as the number of failed authentication attempts, the frequency of OTP requests, the rate of execution requests, and changes in the host’s context. These signals are converted into a risk score obtained by a binary logistic regression scoring function based on normalized features.

Thus, the system can decide whether operation remains within normal parameters or whether restrictions should be imposed. The dongle remains in the NORMAL operating state as long as the three-factor authentication is valid and the protected session is active. If several failed authentication attempts occur, the risk score exceeds the intermediate threshold and the device switches to the SUSPECT operating state. The SUSPECT state involves the application of controlled restrictions that reduce the frequency of authentication attempts by introducing additional waiting times. In this way, the feasibility of an automated attack is reduced.

If the number of failed connection attempts increases even in the SUSPECT state, the risk score will reach the maximum threshold and the dongle will switch to the LOCKDOWN state. In “LOCKDOWN”, executions are refused, and the device can wipe temporary session keys to limit any attempt at reuse. A clear status is displayed to the user, and the only way to return to normal operation is to trigger the recovery flow. In this way, the system not only protects access but also reacts proactively when signs of attack appear.
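The three operating states can be summarized as a small state machine driven by the risk score. The SUSPECT threshold of 0.60 is the value the paper reports in its results; the LOCKDOWN threshold is not given in this section, so the value used below is a placeholder, and the `recovery_done` flag is a hypothetical stand-in for the recovery flow.

```python
from enum import Enum


class State(Enum):
    NORMAL = "NORMAL"
    SUSPECT = "SUSPECT"
    LOCKDOWN = "LOCKDOWN"


SUSPECT_THRESHOLD = 0.60   # reported in the paper
LOCKDOWN_THRESHOLD = 0.85  # placeholder: the actual maximum threshold is not stated here


def next_state(current: State, risk: float, recovery_done: bool = False) -> State:
    """Map the current state and risk score to the next operating state."""
    if current is State.LOCKDOWN:
        # LOCKDOWN is sticky: only the recovery flow returns the device to NORMAL.
        return State.NORMAL if recovery_done else State.LOCKDOWN
    if risk >= LOCKDOWN_THRESHOLD:
        return State.LOCKDOWN
    if risk >= SUSPECT_THRESHOLD:
        return State.SUSPECT
    return State.NORMAL
```

Making LOCKDOWN independent of the current score is the important design point: a falling score alone never re-enables execution.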

2.3.5. Stage 5—Hybrid Management: NTP, Encrypted Backup, and Signed OTA

The final stage, Figure 7, completes the system with mechanisms required for long-term operation and real-world scenarios.

First, TOTP depends on time, and the internal clock can drift. For this reason, the dongle periodically synchronizes its clock using NTP when connectivity is available. In the absence of internet access, the system continues to operate based on the RTC, and synchronization is performed once connectivity becomes available again. Second, the problem of device loss is addressed through an encrypted backup mechanism for the license state. ESP32-S3-GEEK creates a blob in which it stores the information necessary to restore the device in case it is lost or stolen. It is encrypted locally so that the content cannot be accessed and then sent to a local server or an MQTT (Message Queuing Telemetry Transport) broker.

If the user loses their device, they can receive a new one, and after completing three-factor authentication, the information stored in the blob is downloaded and restored to the new device. In this way, the physical loss of the dongle does not lead to the permanent loss of licenses, maintaining the balance between security and availability. Also, to minimize exposure to vulnerabilities and add improvements, firmware updates are performed periodically. The distribution of updates is done via OTA (over-the-air) and each firmware version is verified using a digital signature. If this signature is valid, then the new firmware version will be successfully installed and the anti-rollback counter will also be updated, so as to prevent rolling back to a previous firmware version that may have vulnerabilities.

Conversely, if the signature is not confirmed as valid, the firmware version will not be accepted and will be rejected. This step strengthens the link between the keys and the firmware fingerprint, ensuring long-term security.
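The acceptance rule for an OTA image can be sketched as two checks followed by a counter update. Signature verification itself is left to a caller-supplied function, since it depends on the signing scheme and key material; `FirmwareImage`, `security_version`, and `try_install` are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class FirmwareImage:
    payload: bytes
    signature: bytes
    security_version: int  # compared against the anti-rollback counter


def try_install(image: FirmwareImage, rollback_counter: int,
                verify_signature: Callable[[bytes, bytes], bool]) -> int:
    """Return the new anti-rollback counter, or raise ValueError on rejection."""
    if not verify_signature(image.payload, image.signature):
        raise ValueError("firmware rejected: invalid signature")
    if image.security_version < rollback_counter:
        raise ValueError("firmware rejected: rollback to an older version")
    # The counter only ever moves forward, which is what blocks downgrades.
    return max(rollback_counter, image.security_version)
```

Checking the signature before the version ensures that the version field being compared comes from an authentic image.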

3. Results

The development process of this project follows a sequence of well-defined stages, where each step confirms that the obtained result is stable enough to allow further progress. As shown in Figure 8, it begins with defining the requirements, followed by the gradual implementation of functionalities. The process then moves into the “Build” stage, where it is verified that the project compiles correctly. If the build is successful, the next step is “Flash”, meaning the firmware is uploaded to the ESP32, and after a successful flash the process enters the testing stage to validate correct operation.

3.1. Host Application Integration and Secure Session Initialization

Figure 9 shows the processing time for cryptographic operations on the ESP32-S3 and the end-to-end latency from the application requesting an unlock to the dongle returning the token. All cryptographic operations are completed in less than 53 milliseconds, below the human perception threshold of 0.1 s established by Jakob Nielsen in the book Usability Engineering [28]. The largest latency in a complete session is the time it takes the user to retrieve and enter the TOTP code (about five to ten seconds), which is independent of the hardware platform. From the perspective of the host application, each individual dongle response arrives within a video frame (<60 milliseconds), ensuring seamless integration with no perceptible delays. The AES-128-GCM channel encryption adds only 0.7 milliseconds of overhead per round-trip command, making it negligible compared to the latency of USB serial transfer.

The resulting execution times were measured directly on the ESP32-S3 GEEK development board using the “esp_timer_get_time()” function. Each operation was repeated 30 times, each time in a new measurement session, and the values shown represent the average of the values obtained from these measurements. The standard deviation for all operations was very small, being below 2% of the average value, indicating a low variation under stable operating conditions. There were no abnormal values in the tests performed that exceeded the mean ± 3σ.

The tests were performed on a host PC running Windows 11 (13th generation Intel Core i9 processor and 16 GB RAM), connected to the ESP32-S3 GEEK via USB Serial connection at 115200 baud. The firmware was compiled in Release mode (CONFIG_COMPILER_OPTIMIZATION=y) with ESP-IDF v5.3, and the ESP32-S3 ran at 240 MHz, with the AES and SHA hardware accelerators enabled. End-to-end latency was determined in Python 3.14.4 using the “time.perf_counter()” function, from sending the command to processing the response.
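The host-side measurement procedure (repeated runs timed with `time.perf_counter()`, then mean, standard deviation, and a 3σ outlier check) can be reproduced with a short harness such as the one below. The `operation` callable is a placeholder for sending a command to the dongle and processing its response; the helper names are not taken from the authors' test code.

```python
import statistics
import time
from typing import Callable


def measure(operation: Callable[[], object], runs: int = 30) -> list[float]:
    """Time `operation` `runs` times and return the latencies in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        operation()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def summarize(samples_ms: list[float]) -> dict:
    """Mean, sample standard deviation, relative deviation, and 3-sigma outliers."""
    mean = statistics.mean(samples_ms)
    stdev = statistics.stdev(samples_ms)
    outliers = [s for s in samples_ms if abs(s - mean) > 3 * stdev]
    return {
        "mean_ms": mean,
        "stdev_ms": stdev,
        "rel_stdev": stdev / mean,
        "outliers": outliers,
    }
```

A `rel_stdev` below 0.02 and an empty `outliers` list correspond to the stability criteria stated above.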

3.2. TOTP Provisioning and Successful Three-Factor Authentication (3FA)

The result of generating the TOTP secret for the second authentication factor and enrolling it in Google Authenticator can be observed in Figure 10, which shows a dedicated configuration window. The user receives a QR code generated from an otpauth-type URI, along with the secret in Base32 format, so enrollment can be performed either by scanning or by manual entry. This step is executed only once, during first-time use.
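The enrollment data can be illustrated by generating a Base32 secret and wrapping it in an otpauth URI of the kind authenticator apps accept. The issuer and account strings are placeholders, and the explicit `algorithm`, `digits`, and `period` parameters simply restate the usual defaults; rendering the URI as a QR code requires a separate library and is omitted.

```python
import base64
import os
from urllib.parse import quote, urlencode


def new_totp_secret(num_bytes: int = 20) -> str:
    """Generate a random secret and return it Base32-encoded without padding."""
    return base64.b32encode(os.urandom(num_bytes)).decode("ascii").rstrip("=")


def otpauth_uri(secret_b32: str, account: str, issuer: str) -> str:
    """Build an otpauth://totp/ URI for scanning or manual entry."""
    label = quote(f"{issuer}:{account}")
    query = urlencode({
        "secret": secret_b32,
        "issuer": issuer,
        "algorithm": "SHA1",
        "digits": 6,
        "period": 30,
    })
    return f"otpauth://totp/{label}?{query}"
```

For example, `otpauth_uri("JBSWY3DPEHPK3PXP", "alice", "AHSK")` yields a URI beginning with `otpauth://totp/AHSK%3Aalice?secret=JBSWY3DPEHPK3PXP`.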

After provisioning is completed, the text in the password area changes from “Entering the initial password (provisioning)” to “Entering the previously set password” (Figure 11). This transition confirms that the application detects the device’s state and adapts the message shown to the user: the password has already been set and is now used as a constant authentication factor. The log shows the sequence of messages indicating secret generation, display of the setup window, password validation, and completion of OTP-based authentication, concluding with the success message “UNLOCK OK (3FA)”.

3.3. Negative Testing: Anti-Brute-Force and Lockout Behavior

Figure 12 illustrates the negative testing scenarios and the system’s behavior under repeated unsuccessful attempts, for both a wrong password and an incorrect OTP. When the password or OTP code is repeatedly entered incorrectly, the device applies a temporary lockout policy, which hinders brute-force attacks. Once the lockout is triggered, any subsequent attempt to validate the password or OTP code is rejected, which interrupts the authentication flow and limits the attacker’s ability to continue.
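The lockout behavior described above can be sketched as a small state machine. This is an illustrative model only: the threshold of three failures follows the lockout shown in scenario S₅ of Table 1, whereas the 30 s base duration and the doubling of each successive lockout are assumptions chosen to show a progressive delay, not values taken from the firmware.

```python
class LockoutPolicy:
    """Temporary lockout after repeated failed password/OTP checks (illustrative values)."""

    def __init__(self, max_failures=3, base_lock_s=30.0):
        self.max_failures = max_failures
        self.base_lock_s = base_lock_s   # assumed duration of the first lockout
        self.failures = 0
        self.lockouts = 0
        self.locked_until = 0.0

    def is_locked(self, now):
        return now < self.locked_until

    def record(self, success, now):
        """Register one attempt at time `now`; return 'OK', 'FAIL' or 'LOCKED'."""
        if self.is_locked(now):
            return "LOCKED"              # rejected without evaluating the credential
        if success:
            self.failures = 0
            return "OK"
        self.failures += 1
        if self.failures >= self.max_failures:
            # Progressive delay: each successive lockout lasts twice as long.
            self.locked_until = now + self.base_lock_s * (2 ** self.lockouts)
            self.lockouts += 1
            self.failures = 0
        return "FAIL"

policy = LockoutPolicy()
print([policy.record(False, t) for t in (0.0, 1.0, 2.0)], policy.record(True, 3.0))
```

During a lockout even a correct credential is refused, which matches the observation that every subsequent validation attempt is rejected until the period expires.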

The risk was calculated using binary logistic linear regression:

$$\mathrm{risk} = \mathrm{sigmoid}\left(1.40 + \sum_{i} w_i \, x_i\right) \qquad (1)$$

where $w_i$ is the weight of feature $i$, $x_i$ is its value, and

$$\mathrm{sigmoid}(z) = \frac{1}{1 + e^{-z}} \qquad (2)$$

maps any real value to a probability in the interval [0, 1].

This solution was chosen because of the constraints specific to the proposed hardware configuration, namely the absence of the training datasets needed for a machine learning model and the limited computational budget of the ESP32-S3. The weight values were assigned manually, based on expert judgment, through a constraint satisfaction process. Four operational rules were defined: (C1) a fully authenticated legitimate user must have a risk <0.25; (C2) three consecutive failed authentication attempts must lead to a score above the SUSPECT threshold (0.60); (C3) a very large number of commands in a short time, typical of script automation, must be marked at least as SUSPECT; (C4) a pass_pend state (password accepted, TOTP pending) must not be classified as SUSPECT during authentication, to avoid false positives in legitimate use. The weights were adjusted until all four rules were met simultaneously, then validated through the ten representative scenarios in Table 1.

Each RBLS decision can be audited through eight features (Table 2) selected to cover the attack vectors in the threat model: brute-force password guessing (fail_norm), actions performed automatically by script (cmd_rate), repeated abusive behavior (abuse), a possibly compromised session (pass_pend), and indicators of the operational and valid state of the device (unlocked, provisioned, time_set and uptime).

The weights associated with the features were set as follows: unlocked = −1.40, provisioned = −1.10, time_set = −1.10, fail_norm = +1.80, pass_pend = +0.80, cmd_rate = +2.20, abuse = +2.80 and uptime = +0.10. The highest positive weight was assigned to the abuse variable because it captures suspicious behaviors that are repeated over time. In contrast, uptime received a very low value because it offers little relevance in risk assessment.
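As a worked check of Equation (1) with these weights, the sketch below recomputes the score for the scenarios that correspond to rules C1 to C4 (S₁, S₃, S₅ and S₇), using the feature values listed in Table 2. The function and variable names are illustrative. Because Table 2 reports features rounded to two decimals, the recomputed $z$ can differ from Table 1 in the second decimal place for some scenarios. Only the SUSPECT threshold (0.60) is applied, since the DENY threshold is not stated in this section.

```python
import math

BIAS = 1.40
WEIGHTS = {
    "unlocked": -1.40, "provisioned": -1.10, "time_set": -1.10,
    "fail_norm": 1.80, "pass_pend": 0.80, "cmd_rate": 2.20,
    "abuse": 2.80, "uptime": 0.10,
}
SUSPECT_THRESHOLD = 0.60

def risk_score(features):
    """Return (z, risk) for a dict of feature values; missing features count as 0."""
    z = BIAS + sum(w * features.get(name, 0.0) for name, w in WEIGHTS.items())
    return z, 1.0 / (1.0 + math.exp(-z))

# Feature columns copied from Table 2 (zero-valued features omitted).
SCENARIOS = {
    "S1": {"unlocked": 1, "provisioned": 1, "time_set": 1, "cmd_rate": 0.05, "uptime": 0.20},
    "S3": {"provisioned": 1, "time_set": 1, "pass_pend": 1, "cmd_rate": 0.05, "uptime": 0.10},
    "S5": {"provisioned": 1, "time_set": 1, "fail_norm": 0.50, "cmd_rate": 0.10,
           "abuse": 0.38, "uptime": 0.10},
    "S7": {"provisioned": 1, "time_set": 1, "cmd_rate": 0.75, "abuse": 0.13, "uptime": 0.05},
}

results = {name: risk_score(feats) for name, feats in SCENARIOS.items()}
for name, (z, risk) in results.items():
    print(f"{name}: z = {z:.3f}, risk = {risk:.3f}, suspect = {risk >= SUSPECT_THRESHOLD}")
```

With these inputs S₁ stays below 0.25 (C1), S₃ stays below the SUSPECT threshold (C4), and S₅ and S₇ exceed it (C2 and C3).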

3.4. Resource Footprint: Flash/DIRAM/IRAM Utilization

Table 3 shows that the memory consumed by the proposed solution stays within the platform’s hardware limits. Flash usage is balanced overall between code and constant data, and DIRAM retains a comfortable margin for future extensions, indicating that the implemented functionalities are integrated reasonably from a resource standpoint.

At the same time, the analysis highlights that IRAM is almost fully utilized. The 99.99% IRAM usage reported by ESP-IDF is not a memory shortage, but a normal behavior of the ESP32-S3 architecture. Unlike the original ESP32, the ESP32-S3 does not have a dedicated IRAM bank separate from the data memory, and all internal SRAM is unified and can be used for both data and executable code.

The IRAM segment reported by idf.py size is not a fixed memory area, but one automatically calculated by the linker based on how much code needs to run from RAM, for example IRAM_ATTR functions and interrupt routines. Therefore, this area usually appears almost completely occupied.

In the case of the AHSK firmware, the application only occupies 48 bytes of this segment through the uptime_sec() and usb_write() functions. The remaining 15,308 bytes come from mandatory ESP-IDF components, such as the FreeRTOS scheduler, HAL, interrupt layer, and Xtensa exception vectors, which cannot be moved to flash without affecting operational safety. In addition, certain optional device functionalities can be disabled to further reduce memory consumption without affecting currently implemented functionality.
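The figures quoted above follow directly from Table 3, as the short arithmetic check below shows. The variable names are illustrative; all numbers are copied from the table.

```python
# (used, total) in bytes, copied from Table 3.
segments = {
    "Flash Code": (255_238, 1_572_864),
    "Flash Data": (93_856, 1_572_864),
    "DIRAM": (64_343, 341_760),
    "IRAM": (16_383, 16_384),
}

for name, (used, total) in segments.items():
    print(f"{name}: {100 * used / total:.2f}% used, {total - used} bytes free")

# IRAM breakdown: .text plus .vectors gives the reported total, and the
# framework share of .text is what remains after the 48 bytes used by AHSK.
iram_text, iram_vectors, ahsk_iram = 15_356, 1_027, 48
framework_iram_text = iram_text - ahsk_iram
print(iram_text + iram_vectors, framework_iram_text)
```

The 15,308 bytes attributed to ESP-IDF components are therefore the `.text` part of the IRAM segment minus the application's 48 bytes; the 1027 bytes of `.vectors` are counted separately.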

4. Discussion

The implemented solution is based on the principle of “defense in depth”, starting from the idea that modern attacks can target multiple links in the “chain of trust”. Instead of relying on a single protective barrier, the solution integrates a set of correlated controls that address specific vulnerabilities—from the instability of the host environment and the communication channel’s exposure to replay attacks, to binary integrity and data confidentiality on the embedded device. This layered approach increases the system’s resilience, ensuring that a breach at the level of a single component does not lead to the collapse of the entire security mechanism.

Taking real-world threats as a reference, the architecture maps each practical attack to a specific control. Dongle emulation (spoofing) is mitigated by PKI-based identity and proof-of-possession signatures, so a simple protocol emulator cannot reproduce valid cryptographic proofs without the device private key. Host patching/bypass (tampering) is addressed by moving sensitive authorization logic into the device through the Code-in-Dongle concept: the host no longer “decides locally” based on interceptable checks, but requests a protected computation and receives only a bounded result (token) that is tied to authenticated context and policy. Replay and message injection are mitigated by an AEAD-protected (Authenticated Encryption with Associated Data) session with explicit anti-replay logic (e.g., sequence/TTL). Online guessing and abuse (DoS) against PIN/TOTP are mitigated through 3FA combined with rate limiting/backoff and lockout behavior. Secret disclosure through flash readout is reduced by eliminating static secrets and using a key ladder anchored in device trust, complemented by flash protection mechanisms. Firmware downgrade is mitigated by secure boot and anti-rollback using eFuse counters.
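The sequence/TTL anti-replay logic mentioned for the protected session can be illustrated with the following sketch. It covers only the freshness check and omits the AEAD layer; in a real session the sequence number and timestamp would be bound into the nonce or associated data so that they cannot be altered without failing authentication. The class name and the 5 s TTL are hypothetical.

```python
class ReplayGuard:
    """Accept a message only if its sequence number is new and it is still fresh."""

    def __init__(self, ttl_s=5.0):
        self.ttl_s = ttl_s      # assumed freshness window
        self.last_seq = -1

    def accept(self, seq, sent_at, now):
        if seq <= self.last_seq:
            return False        # replayed or reordered message
        if not (0.0 <= now - sent_at <= self.ttl_s):
            return False        # stale, or timestamped in the future
        self.last_seq = seq
        return True

guard = ReplayGuard()
first = guard.accept(seq=1, sent_at=100.0, now=100.5)
replayed = guard.accept(seq=1, sent_at=100.0, now=100.6)
print(first, replayed)
```

A strictly increasing counter rejects a captured message that is sent again, while the TTL bounds how long a delayed but not yet seen message remains usable.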

Finally, time manipulation, which can affect the one-time password generator, is addressed by using the RTC together with NTP synchronization and a tolerance of ±1 time step. Because the TOTP mechanism depends directly on the device’s internal clock, timing was a challenge from the initial testing phases: without automatic synchronization, correct OTP codes were rejected. To solve this, the SETTIME command was introduced to set the device time automatically from the host system time, which stabilized the authentication process and eliminated unjustified rejections. Dependence on the host system clock, which can be manipulated, remains a weakness of the system, although multi-factor authentication reduces its impact. In addition to the security mechanisms, the development of the system also involved solving technical problems specific to hardware and software integration, as well as the limitations imposed by the chosen platform. One of the most difficult aspects was stabilizing the serial communication, together with correctly synchronizing the application’s internal processes and configuring timeout limits. These measures were taken to avoid situations in which the serial port would freeze or errors would occur due to response timeouts.
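Returning to the TOTP tolerance of ±1 step mentioned above, the following sketch shows how such a window is commonly implemented on top of RFC 6238. It assumes HMAC-SHA-1, a 30 s step and 6 digits, which are the usual authenticator defaults rather than values confirmed for the firmware; the function names are illustrative.

```python
import hashlib
import hmac
import struct

def hotp(secret, counter, digits=6):
    """RFC 4226 HOTP value for an 8-byte big-endian counter."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F                      # dynamic truncation offset
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

def verify_totp(secret, code, unix_time, step=30, window=1, digits=6):
    """Accept `code` if it matches the current time step or one within +/- `window`."""
    counter = int(unix_time) // step
    return any(
        hmac.compare_digest(hotp(secret, counter + drift, digits), code)
        for drift in range(-window, window + 1)
        if counter + drift >= 0
    )

# Shared secret from the RFC 6238 test vectors (ASCII "12345678901234567890").
secret = b"12345678901234567890"
print(verify_totp(secret, "287082", unix_time=59))
```

A code generated for one step is still accepted one step later, which absorbs small clock differences; a clock that is off by two or more steps causes valid codes to be rejected, which is the failure mode that SETTIME was introduced to avoid.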

As new functionalities were gradually integrated, challenges specific to the Espressif IoT Development Framework environment emerged, with a focus on the partition table and the memory sizes allocated to different sections. The addition of new components led to an increase in the size of the binary files and storage requirements, which is why the partition table needed to be reconfigured several times to avoid compilation errors and to ensure sufficient space for all the implemented functionalities.

The results also highlight some limitations that need to be taken into account in future developments. The build results show that, although the almost fully used IRAM does not affect current operation and represents normal behavior, it can become a problem when new functions that must execute from RAM are added. For this reason, future versions will require optimizing or disabling some non-essential device functionalities (e.g., reducing the instruction cache).

It is important to note that the evaluation results presented in this paper represent a prototype-level validation rather than a full security assessment against an adversary. Resistance to advanced physical attacks, including side-channel analysis (differential power analysis, electromagnetic emission analysis), fault injection, and chip-level extraction, has not been experimentally evaluated and remains a subject for future research.

Author Contributions

Conceptualization, A.-I.P.; Methodology, F.-D.A.; Software, A.-I.P.; Validation, A.-I.P.; Formal analysis, F.-D.A.; Investigation, A.-I.P. and F.-D.A.; Resources, A.-I.P. and F.-D.A.; Data curation, A.-I.P.; Writing—original draft, A.-I.P. and F.-D.A.; Writing—review & editing, F.-D.A.; Visualization, A.-I.P.; Supervision, F.-D.A.; Project administration, A.-I.P.; Funding acquisition, F.-D.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:

3FA	Three-Factor Authentication
AEAD	Authenticated Encryption with Associated Data
AES	Advanced Encryption Standard
AES-GCM	Advanced Encryption Standard in Galois/Counter Mode
AHSK	Adaptive Hardware Security Key
CA	Certificate Authority
CRL	Certificate Revocation List
DoS	Denial of Service
ECC	Elliptic Curve Cryptography
ECDSA	Elliptic Curve Digital Signature Algorithm
ESP-IDF	Espressif IoT Development Framework
HKDF	HMAC-based Key Derivation Function
HMAC	Hash-based Message Authentication Code
KDF	Key Derivation Function
MQTT	Message Queuing Telemetry Transport
NTP	Network Time Protocol
OTA	Over-The-Air
OTP	One-Time Password
PKI	Public Key Infrastructure
RTC	Real-Time Clock
SHA-256	Secure Hash Algorithm 256-bit
USB CDC	Universal Serial Bus Communications Device Class

References

1. Katz, J.; Lindell, Y. Introduction to Modern Cryptography Revised Third Edition; CRC Press: Boca Raton, FL, USA, 2025.
2. Stallings, W. Cryptography and Network Security: Principles and Practice, Global Edition, 8th ed.; Pearson: New York, NY, USA, 2024.
3. Hu, G. Study of file encryption and decryption system using security key. In 2010 2nd International Conference on Computer Engineering and Technology, Chengdu, China; IEEE: New York, NY, USA, 2010; pp. V7-121–V7-124.
4. Neacsu, E.; Schiopu, P. A Security Analysis of Public Key Cryptographic Systems Used for Electronic Signature. Univ. Politeh. Buchar. Sci. Bull. Ser. C 2021, 83, 135–146.
5. Li, M.H.; Liu, J.Q. USB Key-Based Approach for Software Protection. In Proceedings of the 2009 International Conference on Industrial Mechatronics and Automation, Chengdu, China, 15–16 May 2009.
6. Wang, C. The Solution Design Using USB Key for Network Security Authentication. In 2012 Fourth International Conference on Computational Intelligence and Communication Networks, Mathura, India; IEEE: New York, NY, USA, 2012; pp. 766–769.
7. Wang, J.; Xu, Z.; Chang, X.; Xiao, C.; Zhang, L. Design and Implementation of USB Key System Based on Dual-Factor Identity Authentication Protocol. J. Electron. Res. Appl. 2024, 8, 161–167.
8. Simion, E.; Patrascu, A. Applied Cryptography and Practical Scenarios for Cyber Security Defense. Univ. Politeh. Buchar. Sci. Bull. Ser. C 2013, 75, 131–142.
9. Li, G.; Chen, H. A new high-level security portable system based on USB Key with fingerprint. In 2010 International Conference On Computer Design and Applications, Qinhuangdao, China; IEEE: New York, NY, USA, 2010; pp. V1-159–V1-162.
10. An, Y.; Zhao, B.; Li, Y. Research on Software Protection Method Based on USBKey. In 2013 International Conference on Computer Sciences and Applications, Wuhan, China; IEEE: New York, NY, USA, 2013; pp. 210–213.
11. Liutkevicius, A.; Vrubliauskas, A.; Kazanavicius, E. Assessment of Dongle-based Software Copy Protection Combined with Additional Protection Methods. Electron. Electr. Eng. 2011, 112, 111–116.
12. Gao, S.; Iu, H.H.-C.; Erkan, U.; Simsek, C.; Toktas, A.; Cao, Y.; Wu, R.; Mou, J.; Li, Q.; Wang, C. A 3D Memristive Cubic Map With Dual Discrete Memristors: Design, Implementation, and Application in Image Encryption. In IEEE Transactions on Circuits and Systems for Video Technology; IEEE: New York, NY, USA, 2025; Volume 35, pp. 7706–7718.
13. Gao, S.; Zhang, Z.; Li, Q.; Ding, S.; Iu, H.H.-C.; Cao, Y.; Xu, X.; Wang, C.; Mou, J. Encrypt a Story: A Video Segment Encryption Method Based on the Discrete Sinusoidal Memristive Rulkov Neuron. In IEEE Transactions on Dependable and Secure Computing; IEEE: New York, NY, USA, 2025; Volume 22, pp. 8011–8024.
14. Lin, Y.; Liao, Y.; Zeng, W.; Wei, Y.; Chen, D.; Yuan, X.; Li, Y.; Erkan, U.; Toktas, A.; Zhang, C.; et al. 3D Non-degenerate Hyperchaos: Design, Analysis, and Application in Image Encryption. In IEEE Transactions on Consumer Electronics; IEEE: New York, NY, USA, 2026.
15. Thomas, G.; Jamaludheen K, J.; Sibi, L.; Maneesh, P.; Mufeedh. A novel mathematical model for group communication with trusted key generation and distribution using shamir’s secret key and USB security. In 2015 International Conference on Communications and Signal Processing (ICCSP), Melmaruvathur, India; IEEE: New York, NY, USA, 2015; pp. 435–438.
16. Wang, F.; Wang, H.; Chen, X. Research on access control technology of big data cloud computing. In Proceedings of the 2023 IEEE 3rd International Conference on Information Technology, Big Data and Artificial Intelligence (ICIBA), Chongqing, China, 26–28 May 2023; pp. 851–856.
17. Abu-Jassar, A.T.; Attar, H.; Yevsieiev, V.; Amer, A.; Demska, N.; Luhach, A.K.; Lyashenko, V. Electronic User Authentication Key for Access to HMI/SCADA via Unsecured Internet Networks. Comput. Intell. Neurosci. 2022, 2022, 5866922.
18. NIST SP 800-193; Platform Firmware Resiliency Guidelines. National Institute of Standards and Technology: Gaithersburg, MD, USA, 2018.
19. Rose, S.; Borchert, O.; Mitchell, S.; Connelly, S. Zero Trust Architecture; National Institute of Standards and Technology: Gaithersburg, MD, USA, 2020.
20. Kügler, R. CodeMeter® in Virtual Environments. White Paper. 2014. Available online: https://support.buildsoft.eu/wp-content/uploads/2018/07/Wibu-Systems_White_Paper_CodeMeter_in_Virtual_Environments.pdf (accessed on 20 April 2026).
21. NIST SP 800-38D; Recommendation for Block Cipher Modes of Operation: Galois/Counter Mode (GCM) and GMAC. National Institute of Standards and Technology: Gaithersburg, MD, USA, 2007. Available online: https://csrc.nist.gov/pubs/sp/800/38/d/final (accessed on 20 April 2026).
22. Krawczyk, H.; Eronen, P. HMAC-based Extract-and-Expand Key Derivation Function (HKDF); RFC5869; RFC Editor, 2010. Available online: https://www.rfc-editor.org/rfc/rfc5869.html (accessed on 20 April 2026).
23. M’Raihi, D.; Machani, S.; Pei, M.; Rydell, J. TOTP: Time-Based One-Time Password Algorithm; RFC6238; RFC Editor, 2011. Available online: https://www.rfc-editor.org/rfc/rfc6238.html (accessed on 20 April 2026).
24. Secure Boot v2. Espressif ESP-IDF (ESP32-S3). Available online: https://docs.espressif.com/projects/esp-idf/en/stable/esp32s3/security/secure-boot-v2.html#secure-boot-v2 (accessed on 20 April 2026).
25. Flash Encryption. Espressif ESP-IDF (ESP32-S3). Available online: https://docs.espressif.com/projects/esp-idf/en/stable/esp32s3/security/flash-encryption.html (accessed on 20 April 2026).
26. Chen, L.; Moody, D.; Regenscheid, A.; Robinson, A.; Randall, K. Recommendations for Discrete Logarithm-based Cryptography: Elliptic Curve Domain Parameters; National Institute of Standards and Technology: Gaithersburg, MD, USA, 2023.
27. Kong, Y.; Tian, J. An ECC-Based Anonymous and Fast Handover Authentication Protocol for Internet of Vehicles. Appl. Sci. 2025, 15, 5894.
28. Nielsen, J. Usability Engineering; Academic Press, Inc.: Cambridge, MA, USA, 1993.

Figure 1. Architecture of the Adaptive Hardware Security Key.

Figure 2. ESP32-S3 GEEK.

Figure 3. Root of Trust, PKI Identity and Key Ladder.

Figure 4. Three-Factor Authentication (3FA).

Figure 5. Code-in-Dongle and Token Issuance.

Figure 6. Anomaly Detection, SUSPECT, and LOCKDOWN.

Figure 7. Hybrid Management: NTP, Encrypted Backup, and Signed OTA Updates.

Figure 8. System Development Process in the ESP-IDF Development Environment.

Figure 9. Latency of cryptographic operations on ESP32-S3.

Figure 10. TOTP enrollment step: provisioning a new secret using an otpauth URI (QR) and Base32 representation.

Figure 11. Post-provisioning interface transition and successful authentication (PASS + TOTP), concluding with “UNLOCK OK (3FA)”.

Figure 12. Anti-brute-force behavior, progressive delay vs. risk score.

Table 3. Memory usage report for build—resource distribution (Flash/DIRAM/IRAM).

Memory Type/Section	Used [Bytes]	Used [%]	Remain [Bytes]	Total [Bytes]	AHSK Used [Bytes]
Flash Code	255,238	16.23	1,317,626	1,572,864	16,388
.text	255,238	16.23	-	-	16,388
Flash Data	93,856	5.97	1,479,008	1,572,864	1200
.rodata	93,600	5.95	-	-	1200
.appdesc	256	0.02	-	-	0
DIRAM	64,343	18.83	277,417	341,760	2896
.text	48,219	14.11	-	-	0
.data	11,276	3.3	-	-	848
.bss	4848	1.42	-	-	2048
IRAM	16,383	99.99	1	16,384	48
.text	15,356	93.73	-	-	48
.vectors	1027	6.27	-	-	0
RTC FAST	24	0.29	8168	8192	0
.rtc_reserved	24	0.29	-	-	0

Table 1. Risk evaluation scenarios.

ID	Scenario	Description	$z$	Risk	Decision
S₁	Normal scenarios	Authenticated user, successful unlock, no incidents	−2.070	0.112	ALLOW
S₂	Normal scenarios	New session, no unlock, provisioned, time set, idle	−0.680	0.336	ALLOW
S₃	Normal scenarios	Password accepted, TOTP pending, legitimate user	0.120	0.530	ALLOW
S₄	Brute-force—wrong passwords	1 failed attempt, fail_norm = 0.17, abuse = 1	−0.029	0.493	ALLOW
S₅	Brute-force—wrong passwords	3 failed attempts, fail_norm = 0.50, abuse = 3, lockout	1.380	0.799	SUSPECT
S₆	Brute-force—wrong passwords	5 failed attempts, fail_norm = 0.83, abuse = 5	2.684	0.936	DENY
S₇	Automation/botnet	High command rate, 15 cmd/10 s, abuse = 1	1.205	0.769	SUSPECT
S₈	Automation/botnet	Maximum combined attack: maximum rate + maximum abuse + failed attempts	5.719	0.997	DENY
S₉	Special scenarios	Unprovisioned device, first boot, no credentials	1.401	0.802	SUSPECT
S₁₀	Special scenarios	Successful unlock but suspicious behavior, high rate	0.290	0.572	ALLOW

Table 2. Feature weights and values across validation scenarios.

Feature	w_i	S₁	S₂	S₃	S₄	S₅	S₆	S₇	S₈	S₉	S₁₀
unlocked	−1.40	1.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	1.00
provisioned	−1.10	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00	0.00	1.00
time_set	−1.10	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00	0.00	1.00
fail_norm	+1.80	0.00	0.00	0.00	0.17	0.50	0.83	0.00	0.83	0.00	0.00
pass_pend	+0.80	0.00	0.00	1.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00
cmd_rate	+2.20	0.05	0.05	0.05	0.05	0.10	0.10	0.75	1.00	0.00	0.80
abuse	+2.80	0.00	0.00	0.00	0.13	0.38	0.63	0.13	1.00	0.00	0.25
uptime	+0.10	0.20	0.10	0.10	0.10	0.10	0.15	0.05	0.20	0.01	0.30


Share and Cite

MDPI and ACS Style

Popovici, A.-I.; Anton, F.-D. ESP32-Based Hardware Key for Software Application Protection. Appl. Sci. 2026, 16, 4251. https://doi.org/10.3390/app16094251

