Review Reports
- Christian Schwinne* and
- Jan Pelzl
Reviewer 1: Mario Kusek
Reviewer 2: Anonymous
Reviewer 3: Ahmed Zouinkhi
Round 1
Reviewer 1 Report
The authors approach security in a different way: they propose local communication with the IoT device over plain HTTP (without TLS) and introduce an additional mechanism that uses hashing, a nonce, and HMAC to improve local security. This is an original and valid idea.
I don't see any flaws in the approach beyond those the authors already state in the discussion and conclusion.
In the introduction, in the part on flexibility (lines 80-91), I would add that users need many apps on their phone to use IoT devices from different vendors.
Section 2 (lines 127-129) is so short that I would make it the last paragraph of the introduction.
Figure 3: What do the arrows mean? For example, what does the arrow between "user enters password..." and "derive pre-shared key..." represent? It is not the same as the arrows pointing left and right. The same applies to the other arrows. Different arrow types with a legend would be much better. The order of execution is also unclear, so numbering the steps or explaining them in detail in the text would help.
In line 451 you write "visualized below." Does that refer to Figure 4 or Figure 3?
Figure 4: The same comment as for Figure 3. I assume that "PSK from initialization" and "Nonce from initialization" indicate that this information is needed for "HMAC-authenticate ...", and that "Control command ..." starts the execution of "HMAC-authenticate ...". These should use different arrows, with a legend describing them.
I recommend adding a sequence diagram to show clearly how execution proceeds over time.
For the discussion, I have some questions that are not answered:
- How will your approach work with more devices (10-50)? As I understand it, each device needs 2 open tabs, so controlling 10 devices requires 20 open tabs. Am I correct? That would not be practical.
- What does the user need to open at the beginning of the flow? It is not clear how use of your approach starts. A sequence diagram and example screenshots showing the execution would be much better.
Author Response
Please see the attachment.
Author Response File:
Author Response.pdf
Reviewer 2 Report
This manuscript proposes a fully local, browser-based method to authenticate commands between a web client and resource-constrained IoT devices without TLS. It uses a “secure crypto tab” loaded in a Secure Context to derive a PSK from a user password via PBKDF2 and to compute HMACs, then relays authenticated messages to a device UI loaded over HTTP using window.postMessage. A WLED proof-of-concept on ESP8266/ESP32 shows low RAM overhead and feasibility of MAC-only transport for integrity and origin authentication. The work argues this design improves privacy and sustainability by decoupling from cloud control while remaining lightweight for constrained hardware. However, I recommend Major Revision for the following reasons.
- The security goals and adversary are not formalized. The text alternates between mixed-content constraints, MitM on HTTP, and UI injection without a unified model. Add a precise system and threat model covering network, browser, and local attackers. Specify what each party controls, capabilities for content injection, origin spoofing, replay, DoS, UI redress, and physical access. Tie each design choice to an explicit threat and a testable security objective.
- The initial handshake sends postMessage('*'), then relies on event.origin later. This creates a time-of-check issue where an unintended opener may race messages.
- The HTTP-served UI can be modified by an on-path adversary to request HMACs for attacker-chosen payloads. The paper mentions this limitation but does not mitigate it in the PoC.
- There is no detailed workflow for PSK provisioning, rotation, revocation, or handling multiple users and devices. Session counters and nonces are mentioned but lifecycle and recovery are not covered.
- PBKDF2 is selected only because it is available in Web Crypto. Parameters and expected work factor are not justified against modern GPU attackers.
- The evaluation omits comparisons with TLS-PSK, DTLS for IoT, Noise patterns, or Matter’s local control. Add a baseline comparison on code size, RAM, CPU, and latency versus TinyDTLS, wolfSSL-PSK, or a Noise-PSK handshake, and discuss security trade-offs. Include ESP32 results, not only ESP8266.
- Provide a protocol description in formal notation and analyze it with ProVerif or Tamarin under the Dolev-Yao model, covering authentication, replay, and channel binding between the two browser contexts.
- Important mitigations such as CSP, COOP/COEP, Referrer-Policy, X-Frame-Options, and robust input validation for messageEvent.data are not specified. Include a security headers profile, strict schema validation for all inter-window messages, and tests for clickjacking, XSS in device UI, and service worker misuse.
- Nonce handling is described but not stress-tested under packet reordering, reconnects, or clock and counter resets.
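To make the nonce stress-testing request concrete, the following is a minimal sketch of the kind of check meant here. It is not the authors' protocol: the message layout (an 8-byte counter prepended to the payload), the names `derive_psk`, `sign`, and `Device`, and the iteration count are all assumptions chosen for illustration only.

```python
import hashlib
import hmac


def derive_psk(password: str, salt: bytes, iterations: int = 100_000) -> bytes:
    # PBKDF2-HMAC-SHA256; the iteration count is an arbitrary placeholder,
    # not a recommendation and not the value used in the manuscript.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


def sign(psk: bytes, counter: int, payload: bytes) -> bytes:
    # Hypothetical message layout: 8-byte big-endian counter, then the payload.
    message = counter.to_bytes(8, "big") + payload
    return hmac.new(psk, message, hashlib.sha256).digest()


class Device:
    """Hypothetical device-side verifier with a strictly increasing counter."""

    def __init__(self, psk: bytes) -> None:
        self.psk = psk
        self.last_counter = 0

    def accept(self, counter: int, payload: bytes, tag: bytes) -> bool:
        expected = sign(self.psk, counter, payload)
        # Constant-time comparison of the received and expected tags.
        if not hmac.compare_digest(expected, tag):
            return False
        # Reject replays and anything older than the newest accepted message.
        if counter <= self.last_counter:
            return False
        self.last_counter = counter
        return True


if __name__ == "__main__":
    psk = derive_psk("example password", b"example-salt")
    device = Device(psk)
    first = sign(psk, 1, b"on")
    second = sign(psk, 2, b"off")
    print(device.accept(2, b"off", second))  # newer message arrives first
    print(device.accept(1, b"on", first))    # reordered older message
    print(device.accept(2, b"off", second))  # exact replay
```

A strictly increasing counter of this kind rejects replays but also drops legitimately reordered packets, and a counter that restarts at zero after a device reboot would reject all messages until the client catches up. These are the cases the revised evaluation should exercise.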
Author Response
Please see the attachment.
Author Response File:
Author Response.pdf
Reviewer 3 Report
This manuscript proposes a novel, fully local approach for secure web-based communication between browser clients and resource-constrained IoT devices. The idea of using browser inter-window messaging as a lightweight alternative to TLS for local device control is interesting and relevant to privacy-preserving IoT systems. The paper is clearly written overall, with solid background explanations and a convincing proof-of-concept implementation. However, there are some issues in technical validation and methodological justification that should be addressed before publication. A few minor typographical errors were also noted.
Comments to the authors:
- Line 64: “less then” should be “less than”.
- Line 131: “encompass both the security objectives the solution aims to achieve”. Missing comma after objectives.
- The Results section focuses mainly on resource utilization. While this is informative, it would strengthen the work to include at least a small quantitative performance comparison (e.g., latency, response time, or energy consumption) between the proposed approach and a lightweight TLS implementation such as mbedTLS.
- It would be helpful to clarify whether the HMAC-based system was stress-tested under repeated command loads or concurrent browser sessions.
- The paper discusses the Browser Cryptography Chicken and Egg Problem in depth, but the actual threat model used for evaluating the solution remains a bit vague. What specific attacker capabilities are assumed (e.g., local network attacker, compromised browser tab, etc.)?
- Please also clarify whether replay attack mitigation was formally verified or only conceptually described (nonce-based).
- The evaluation of the 1 KB memory overhead is interesting, but it would be useful to explain how this number was measured (compile-time vs runtime diagnostics).
- The authors should consider providing a fallback or future browser API direction to make this approach more seamless.
- The Discussion section is somewhat descriptive. A more comparative discussion with similar local control frameworks such as Matter, Web of Things, or CoAP/DTLS could better position this contribution in the existing landscape.
- Consider integrating a minimal benchmark showing how the HMAC generation time on ESP8266 compares to SHA-1 or ChaCha-based HMAC to demonstrate efficiency.
- Adding an experimental evaluation of latency between the browser crypto tab and the device would greatly improve the credibility of the proposal.
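As an indication of the shape such a timing comparison could take, here is a minimal host-side sketch. It runs on a desktop Python interpreter, not on the ESP8266, so its numbers say nothing about the device. The function name `time_hmac`, the key and message sizes, and the round count are assumptions for illustration.

```python
import hmac
import time


def time_hmac(digest_name: str, key: bytes, message: bytes, rounds: int = 10_000) -> float:
    """Return the mean wall-clock seconds per HMAC computation."""
    start = time.perf_counter()
    for _ in range(rounds):
        hmac.new(key, message, digest_name).digest()
    return (time.perf_counter() - start) / rounds


if __name__ == "__main__":
    key = b"k" * 32       # placeholder 32-byte key
    message = b"m" * 64   # placeholder 64-byte command payload
    for name in ("sha256", "sha1"):
        print(name, time_hmac(name, key, message))
```

An on-device measurement would follow the same pattern: fixed key and message sizes, many repetitions, and a reported mean per operation, ideally with the variance as well.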
Thank you for your contribution
Author Response
Please see the attachment.
Author Response File:
Author Response.pdf
Round 2
Reviewer 2 Report
The authors have satisfactorily modified their manuscript according to my previous criticisms. Therefore, I recommend the publication of this manuscript.