PRIVocular: Enhancing User Privacy Through Air-Gapped Communication Channels

Bikos, Anastasios N.

doi:10.3390/cryptography9020029

Department of Computer Engineering and Informatics, University of Patras, 26504 Patras, Greece

Cryptography 2025, 9(2), 29; https://doi.org/10.3390/cryptography9020029

Submission received: 5 March 2025 / Revised: 18 April 2025 / Accepted: 21 April 2025 / Published: 1 May 2025

Abstract

Virtual reality (VR)/the metaverse is becoming a ubiquitous technology by leveraging smart devices to provide highly immersive experiences at an affordable price. Cryptographically securing such augmented reality schemes is of paramount importance. Symmetric cryptography is the workhorse of modern cryptography because of its ease of use and speed, but its main issue is securely transferring the same (obfuscated) secret key between several parties. Typically, asymmetric cryptography establishes a shared secret between parties, after which the switch to symmetric encryption can be made. However, several state-of-the-art (SoTA) security research schemes lack flexibility and scalability for industrial Internet-of-Things (IoT)-sized applications. In this paper, we present the full architecture of the PRIVocular framework. PRIVocular (i.e., PRIV(acy)-ocular) is a VR-ready hardware–software integrated system that is capable of visually transmitting user data over three versatile modes of encapsulation, encrypted, without loss of generality, using an asymmetric-key cryptosystem. These operation modes can be optical character-based or QR-tag-based. Encryption and decryption primarily depend on each mode’s success ratio of correct encoding and decoding. We investigate the most efficient means of ocular (encrypted) data transfer by considering several designs and contributing to each framework component. Our pre-prototyped framework can provide such privacy preservation (namely virtual proof of privacy (VPP)) and visually secure data transfer promptly (<1000 ms) and at a physical distance of ∼50 cm from the smart glasses.

Keywords:

metaverse security; OCR; privacy; privacy preservation; QR; secure visual key exchange; virtual proof; virtual reality cybersecurity; visual cryptography

1. Introduction

Virtual reality (VR)/extended reality (XR) technology has advanced significantly over the last few years. It has been suggested that 2016 was the year that virtual reality went from virtual to reality [1]. VR encompasses a collection of technologies (3D displays, input devices, UHD cameras, wireless network protocols, software frameworks, etc.) that aim to create an interactive medium that offers human users the feeling of being immersed. The evolution of consumer-grade hardware (such as Oculus Rift and HTC Vive), as well as the flexibility and portability of software platforms (e.g., Android) to create and display VR content, strongly suggests that this field could be the next wave of success in computer technology [1]. Interestingly, consumer prices for these VR devices have also been steadily dropping, creating a huge potential for greater mass availability to the public [2].

Such immersive devices can be smart glasses, tablets, or smartphones. As part of an integrated VR framework, these devices can utilize the visible spectrum and, together with their UHD cameras, can easily capture and further process visual data in various formats. Data transmission on the visible spectrum can be challenging due to various aspects that affect optical performance, e.g., room lighting conditions, display reflections, and contrast. Combined with the specific format (optical characters, images, code tags, etc.) of the visually displayed data, there is a trade-off between the performance of correct (error-free) optical transmission and the amount of data to be transmitted optically.

At the moment, the majority of research on augmented reality (AR) security focuses on user-to-user or user-to-infrastructure authentication to create secure communication and thwart potential threats, including man-in-the-middle attacks. Through AR technology, people can interact with virtual objects and information in their real-world surroundings. Making virtual objects, however, can take a lot of effort, and malicious parties might illegally duplicate these objects and include them in their own augmented reality settings. Verifying whether AR objects and the content they interact with are authentically integrated is therefore crucial. In this research, we propose a unique approach to authenticating, securing, and preserving the privacy of AR-delivered material by employing asymmetric data-hiding strategies.

The use of the most optimal means of visual transfer of ocular data (e.g., optical character recognition [3,4,5,6,7], Quick Response Code [8,9,10,11,12,13,14,15]) has several areas of applicability, provided we receive a relatively large amount of data. Using the photon carrier, or visible domain, between a computer display and the UHD camera of a VR device, we can process user information for several applications or vertical industries (e.g., finance, medical, and military activities). For example, we can read an image code tag or optical characters from a display screen via camera sensors to retrieve industrial codes for product tags or to tag patient codes in hospitals for medical history queries. However, the security of data transmitted through optical means can be compromised, and these data are considered confidential or user private in most applicable cases [16,17]. Although the science fiction metaverse is not yet a reality, it is feasible to understand what it would look like. Software is not the only aspect of the metaverse as it is currently envisioned: many businesses use virtual and augmented reality technology, like glasses and headsets, for their metaverse projects. Along these lines, air-gapped visual data transmission technology is one of the latest technological developments in virtual and augmented reality gadgets. However, new data protection issues, such as cryptographic security and privacy guarantees, may surface as more and more types of data are processed [18,19,20,21].

Therefore, we need a method to secure privacy-sensitive data sent over the optical carrier by selecting a strong encryption scheme. The Paillier cryptosystem, proposed by Pascal Paillier in 1999, is a probabilistic asymmetric algorithm for public-key cryptography [22]. We could provide VR systems with such confidentiality by using this scheme with strong encryption/decryption keys (ideally, as large as 1024 bits). At the same time, encrypting optical data for visual transmission should not be constrained by the limited resources of the VR device.

Hence, we require a method to secure sensitive data sent to the user via the visible spectrum. To this end, this paper describes the architecture, component functionality, and design analysis of the open-source framework PRIVocular. (While the framework is built on the Paillier cryptosystem due to earlier work of the authors in [23], it can be used with any underlying cryptographic scheme.) The PRIVocular framework is a private VR hardware–software integration system that aims to visually encode data using one of several defined optical representation methods, encrypt the encoded data, and then visually capture (transmit/receive on the visible spectrum) the image ciphertext. Finally, the framework should retrieve the initial data successfully via the reverse cycle of integrity-correct and confidentiality-preserving decoding and the corresponding decryption methods. PRIVocular follows a client–server architecture and is based on the Android framework [7,9,10,11,13,24,25]. Encryption/decryption is performed only on VR devices with appropriate security priority constraints and manageable performance, using the Paillier cryptosystem. The primary goal of PRIVocular is threefold: (1) performance (i.e., end-to-end transmission of the maximum possible amount of optical data), (2) integrity (i.e., 100% accurate reconstruction of the original data), and (3) confidentiality (i.e., privacy preservation of user data). With the Paillier cryptosystem, successful decryption depends on the key size that the end-to-end user defines for their data to be transmitted and retrieved correctly. The most novel design motivation behind PRIVocular is user-client detection, meaning that ideally only the user who possesses the correct key can retrieve the visually encoded original data (thus assuring data content integrity).

This paper makes the following research contributions:

We prototype PRIVocular, an open-source framework (that works for any type of asymmetric/symmetric key encryption scheme) that aims to operate as a virtual proof of privacy, to establish strong cybersecurity constraints in vertical industrial-level applications (IoT), which demand extremely low latency.
We integrate PRIVocular inside a metaverse-applicable immersive reality platform architecture.
We design, implement, and incorporate a MoMAtag inside our framework. The MoMAtag offers 61% more capacity than the QR tag (version 40) [8].

There is a notable lack of literature examining the design of content copyright protection mechanisms and security–privacy concerns for AR objects within these AR interaction environments that can react to user actions, even though researchers have created user authentication systems for AR interactive environments [9,10,11,12,13,14,15,25,26]. In this research, we concentrate on possible ways to secure AR content’s copyright (for several industry use cases) and privacy protection. Our study is the first to directly address AR digital content protection, in contrast to earlier studies that focused on user identity verification in AR protection challenges. Through this study, we seek to advance a more thorough comprehension of AR protection, cybersecurity, and cryptographic security concerns.

The remainder of the paper is organized as follows. Section 2 briefly discusses some important preliminary topics and the most relevant work in the literature. Section 3 describes the implementation of the proposed system. Section 4 presents experiments and tests for the application, while Section 5 concludes the paper and outlines future work.

2. Background

This section will discuss basic theories and technologies about optical data encoding formats, namely optical character recognition (OCR), the quick response (QR) code tag, and the Paillier encryption scheme. Finally, some relevant work on VR frameworks and applications will be briefly presented.

2.1. Tesseract OCR Engine

Optical character recognition (also optical character reader, OCR) is the electronic conversion of images of typed, handwritten, or printed text to machine-encoded text, whether from a scanned document, a photo of a document, or a digital image. There are currently many engines and techniques to perform OCR. One of the most prominent is Tesseract.

Tesseract is an open-source OCR engine developed at HP as a research prototype between 1984 and 1994. The Tesseract architecture assumes that its input is a binary image with optional polygonal text regions defined. The processing then follows a traditional step-by-step pipeline. The first step, which is the most computationally expensive, is a connected component analysis, in which the outlines of the components are stored. At this stage, outlines are gathered together, purely by nesting, into so-called Blobs. Blobs are then organized into text lines, and the lines and regions are analyzed for fixed-pitch or proportional text. Text lines are broken into words differently according to the kind of character spacing: fixed-pitch text is chopped into character cells, while proportional text is broken into words using definite and fuzzy spaces [4]. A simple illustration scenario is depicted in Figure 1.
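To make the first pipeline step more concrete, the following is a minimal sketch of connected component analysis on a binary image. It is not Tesseract's actual implementation: the 4-connectivity rule, the list-of-lists image representation, and the helper names `connected_components` and `bounding_box` are assumptions chosen for illustration only.

```python
from collections import deque


def connected_components(image):
    """Group 4-connected foreground pixels (value 1) of a binary image.

    `image` is a list of rows; each row is a list of 0/1 values.
    Returns a list of components, each a list of (row, col) pixels,
    in the order in which a row-major scan first meets them.
    """
    rows = len(image)
    cols = len(image[0]) if image else 0
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited ink pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and image[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(pixels)
    return components


def bounding_box(pixels):
    """Return (top, left, bottom, right) of one component."""
    ys = [p[0] for p in pixels]
    xs = [p[1] for p in pixels]
    return (min(ys), min(xs), max(ys), max(xs))
```

Each component's bounding box is the kind of information a later stage could use to group outlines into blobs and text lines; a production engine stores outlines rather than raw pixel lists.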

The recognition phase then proceeds as a two-pass process. In the first pass, an attempt is made to recognize each word in turn. Each well-recognized word is passed to an adaptive classifier as training data, which gives the adaptive classifier the opportunity to recognize text further down the page more accurately. Since the adaptive classifier may have learned something useful too late to help with earlier words, a second pass is run in which words not recognized well enough in the first pass are recognized again. A final phase resolves fuzzy spaces and checks alternative x-height hypotheses to locate small-cap text [4].

The recognition of Latin script and typewritten text is still not 100% accurate, even with clear imaging. A study on the recognition of newspaper pages of the 19th and early 20th century concluded that character-by-character OCR accuracy for commercial OCR software varied from 81% to 99% [3]. Optical character recognition is not a computationally cost-free operation, especially for mobile devices such as smartphones and tablets. Because the computational cost of correcting errors dominates the optical document conversion process, the most important characteristic of an OCR device is accuracy.
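Character-by-character accuracy can be quantified in several ways. The sketch below uses one common convention, which is an assumption here rather than the metric used in [3]: one minus the Levenshtein edit distance between the reference text and the OCR output, divided by the reference length. The function names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (unit costs)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_accuracy(reference, recognized):
    """Fraction of reference characters recovered, clamped to [0, 1]."""
    if not reference:
        return 1.0 if not recognized else 0.0
    errors = edit_distance(reference, recognized)
    return max(0.0, 1.0 - errors / len(reference))
```

For encrypted payloads this metric is unforgiving: a single misread character in a ciphertext generally makes decryption fail, so anything below 1.0 means the transfer must be repeated or corrected.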

Thus, several performance considerations have been suggested for the Tesseract OCR engine. Ref. [6] discusses a novel and cost-effective method for eliminating background images/watermarks to improve OCR performance. One well-known critical procedure in OCR is to detect text characters from a document image. To address this potential issue, the authors first enhance the document images before OCR by utilizing brightness and chromaticity as contrast parameters. Then, they convert color images to grayscale and threshold them. In this way, as they claim, background images can be removed effectively without losing the quality of recognized text characters. In another study [5], the authors emphasize the default lighting conditions, the object tilt, and the camera focus settings as external parameters to increase the accuracy of the OCR text rather than redesigning the Tesseract engine itself. However, the reported improvement in recognition accuracy is about 5%.
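The grayscale-then-threshold step can be sketched as follows. This is a simplified illustration, not the method of [6]: it skips the brightness/chromaticity enhancement, and the luma weights (0.299, 0.587, 0.114), the fixed cutoff of 128, and the function names are assumptions.

```python
def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) tuples (0-255) to rows of gray levels."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in rgb_image
    ]


def threshold(gray_image, cutoff=128):
    """Binarize: 1 marks dark (ink) pixels, 0 marks light background."""
    return [[1 if px < cutoff else 0 for px in row] for row in gray_image]
```

With this cutoff, a light-gray watermark pixel such as (200, 200, 200) maps to background while dark text such as (30, 30, 30) is kept as ink. A real deployment would more likely pick the cutoff adaptively from the image histogram.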

As mentioned above, the cryptographic performance of a VR framework, such as PRIVocular, is based on the visual accuracy of the captured information. Although an error-prone technique, OCR is the natural way to encode and transmit optical data. Thus, despite its potential negative effect on integrity, PRIVocular includes the detection of OCR as the baseline for privacy-preserving data transfer through air-gapped communication channels.

2.2. QR Code

QR (abbreviated from Quick Response Code) is the trademark of a matrix barcode (or two-dimensional barcode). A barcode is a machine-readable optical label that contains information about the item to which it is attached. A QR code uses four standardized encoding modes (numeric, alphanumeric, byte/binary, and kanji) to store data efficiently; extensions may also be used. A QR code consists of black squares arranged in a square grid on a white background, which can be read by an imaging device such as a camera and processed using Reed–Solomon error correction until the image can be appropriately interpreted. The required data are then extracted from the patterns present in the horizontal and vertical components of the image [8].

The amount of data that can be stored in the QR code symbol depends on the data type (mode or input character set), the version (1, ..., 40, indicating the overall dimensions of the symbol), and the level of error correction. Maximum storage capacities occur for 40-L symbols (version 40, error correction level L) [8]. The codewords are 8 bits long and use the Reed–Solomon error correction algorithm [27] with four levels of error correction. The higher the error correction level, the less storage capacity. Table 1 lists the approximate error correction capability at each level.

Figure 2 illustrates the structural analysis of the QR tag. The version parameter specifies the size and data capacity of the code. Versions range between 1 and 40, where version 1 is the smallest QR code and version 40 is the largest. If this parameter is left unspecified, then the content and error correction level will be used to guess the smallest possible QR code version that the content will fit inside.
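The version-selection rule described above can be sketched in a few lines. The side-length formula (17 + 4 × version modules) follows from the standard symbol sizes of 21 × 21 for version 1 and 177 × 177 for version 40. The capacity figures are an excerpt of the commonly published byte-mode values at error correction level L and should be checked against the QR specification before being relied on. Treating a Paillier ciphertext as raw bytes of twice the key length, with no framing overhead, is likewise an assumption, as are all function names.

```python
# Byte-mode data capacity (in bytes) at error correction level L.
# Excerpt only: versions 1-5 plus the maximum, version 40.
BYTE_CAPACITY_LEVEL_L = {1: 17, 2: 32, 3: 53, 4: 78, 5: 106, 40: 2953}


def side_modules(version):
    """Number of modules along one side of a QR symbol."""
    if not 1 <= version <= 40:
        raise ValueError("QR versions range from 1 to 40")
    return 17 + 4 * version


def smallest_version(payload, capacities=BYTE_CAPACITY_LEVEL_L):
    """Smallest listed version whose capacity fits the payload, else None."""
    for version in sorted(capacities):
        if len(payload) <= capacities[version]:
            return version
    return None


def ciphertext_bytes(key_bits):
    """Size of one Paillier ciphertext: an element modulo N^2."""
    return (2 * key_bits + 7) // 8


def ciphertexts_per_symbol(key_bits, capacity_bytes):
    """How many whole ciphertexts fit into one symbol of given capacity."""
    return capacity_bytes // ciphertext_bytes(key_bits)
```

Under these assumptions, a 1024-bit key yields 256-byte ciphertexts, so a 40-L symbol would carry at most 11 of them, which illustrates why tag capacity matters for encrypted payloads.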

Although the OCR technique often lacks optical accuracy and stable performance, QR is a stronger candidate for more time-critical machine-readable applications. Thus, PRIVocular further utilizes QR tags as an additional data transmission mode through air-gapped channels. Moreover, the framework includes custom extensions to the conventional QR tag specification to encapsulate more data while retaining the same error-correction accuracy standards.

2.3. Paillier Encryption Scheme

As mentioned above, PRIVocular uses Paillier as the underlying cryptographic engine. Any cryptosystem (e.g., AES) can be used with PRIVocular, as its main contribution is the encoding/decoding of visual data, not the cryptosystem utilized. Paillier has been chosen because partially homomorphic encryption schemes allow data manipulation without the cryptographic key. In the context of PRIVocular, the display source does not require the decryption key: it can still process encrypted data but not decrypt it. As claimed earlier, PRIVocular can, in practice, operate with any type of cryptosystem (symmetric private key and/or asymmetric public key). In both public- and private-key modes, we always rely upon a trusted key server, including when a typical private-key protocol such as the Advanced Encryption Standard is selected. Symmetric-key algorithms use the same cryptographic keys for both plaintext encryption and ciphertext decryption; the keys can be identical or related by a simple transformation, and they represent a shared secret between parties, used to maintain a private information link. Thereby, in the private-key operational mode, we (1) always trust the key storage server, and (2) use a shared secret (QR code) between the two parties (e.g., the encrypting and decrypting users). A pre-shared key (PSK) is a shared secret that has been securely shared between two parties before it is needed for use. Thus, via this method, the key is distributed as a QR code through secure communication channels from the trusted key server to the sharing parties (e.g., the clients share the same key utilizing WPA-PSK, WPA2-PSK, and EAP-PSK protocols, ensuring secure communication and authentication).

This section briefly discusses the Paillier cryptosystem to justify the performance overheads of encoding/decoding and encryption/decryption.

Let $N$ be a cryptographic parameter equal to the product of two random primes, $p$ and $q$, i.e., $N = pq$. We consider $m$ to be the plaintext and $c$ the corresponding ciphertext. The Paillier encryption scheme is defined as a unique correspondence between a value $c$ from $\mathbb{Z}_{N^{2}}^{*}$ and values $m$ and $r$ from $\mathbb{Z}_{N}$ and $\mathbb{Z}_{N}^{*}$, respectively:

$$c = r^{N} g^{m} \bmod N^{2} \qquad (1)$$

where $g$ is a generator in $\mathbb{Z}_{N^{2}}^{*}$. In our encryption scheme, $r$ is the probabilistic part, while $m$ is the plaintext value to be protected [23]. Moreover, decryption requires knowledge of $\lambda$, which is the value of the Carmichael function of $N = pq$:

$$\lambda = \operatorname{lcm}(p - 1, q - 1) \qquad (2)$$

That $N$ divides the order of $g$ is ensured by checking the existence of the following modular multiplicative inverse:

$$\mu = \left( L\left( g^{\lambda} \bmod N^{2} \right) \right)^{-1} \bmod N \qquad (3)$$

where the function $L$ is defined as $L(x) = \frac{x - 1}{N}$. Thus, finally, we compute the plaintext message as:

$$m = L\left( c^{\lambda} \bmod N^{2} \right) \cdot \mu \bmod N \qquad (4)$$

Decryption is essentially one exponentiation modulo $N^{2}$ [22].
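The four equations above can be exercised end to end with a toy implementation. The sketch below is illustrative only and is not the PRIVocular code: it assumes the common generator choice g = n + 1, where `n` is the modulus pq, takes the primes as inputs instead of generating them, and is meant to be run with tiny, insecure primes. It requires Python 3.8 or later for the modular inverse via `pow(x, -1, n)`.

```python
import math
import secrets


def L(x, n):
    """The function L(x) = (x - 1) / n from the decryption equations."""
    return (x - 1) // n


def keygen(p, q):
    """Build a key pair from two distinct primes (toy sizes only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1                                          # assumed generator
    # mu exists only if n divides the order of g; pow raises otherwise.
    mu = pow(L(pow(g, lam, n * n), n), -1, n)
    return (n, g), (lam, mu)


def encrypt(public_key, m, r=None):
    """Equation (1): c = r^n * g^m mod n^2, with random r coprime to n."""
    n, g = public_key
    if not 0 <= m < n:
        raise ValueError("plaintext must lie in [0, n)")
    if r is None:
        while True:
            r = secrets.randbelow(n - 1) + 1
            if math.gcd(r, n) == 1:
                break
    n2 = n * n
    return (pow(r, n, n2) * pow(g, m, n2)) % n2


def decrypt(public_key, private_key, c):
    """Equation (4): m = L(c^lambda mod n^2) * mu mod n."""
    n, _ = public_key
    lam, mu = private_key
    return (L(pow(c, lam, n * n), n) * mu) % n


def add_encrypted(public_key, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts modulo n."""
    n, _ = public_key
    return (c1 * c2) % (n * n)
```

A usage example with the small primes 1009 and 1013: `pub, priv = keygen(1009, 1013)`, then `decrypt(pub, priv, encrypt(pub, 42))` recovers 42, and `add_encrypted` combines two ciphertexts without access to the private key, which is the property that lets a display source process encrypted data it cannot read.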

2.4. Use Case Paradigm

Although PRIVocular can be applied in virtually any vertical industry, in this research work we particularly target the eHealthcare/medical use case scenario: visually obfuscating patients’ privacy-sensitive data (for instance, electrocardiographs (ECGs)) over an air-gapped channel, with on-the-fly encryption and decryption performed in the same manner in which PRIVocular normally operates. Thus, medical healthcare personnel could use VR/augmented reality smart glasses inside the hospital premises to visually inspect a patient’s medical history and current medical conditions, whether critical or not, in an ultra-fast (air channel), secure, and content-privacy-preserving manner (the PRIVocular framework).

It is well recognized that Internet-of-Things (IoT) use in healthcare has grown significantly in recent years [28]. IoT and Big Data aim to address the major obstacles found in different eHealth applications. Cloud-based personal health record (PHR) systems are an emerging patient-centric paradigm of health information interchange and interaction that holds enormous potential for improving patient outcomes and assuring more effective healthcare system delivery. With the help of these cloud-based solutions, PHR users can share their health information with third-party data owners only after it has been carefully vetted and safely stored on semi-trusted cloud service providers. It is difficult to outsource private information retrieval (PIR) to untrusted servers under remote on-demand cloud operations while maintaining database owner privacy [29]. PRIVocular’s particular use-case study examines its intrinsic applicability and efficacy for the most recently implemented public cloud-based Post-Quantum (PQ) PIR techniques. In parallel, PRIVocular can provide real quantum-leveraged PIR techniques for privacy preservation among semi-honest but inquisitive parties (symmetrically private information retrieval systems) and sub-linear networking communication. Important ideas like searchable-symmetric encryption [30] and fully homomorphic encryption [31], which will fall under PIR’s purview, could also become potentially exploitable and industry ready through the PRIVocular framework. Last but not least, metaverse technologies have recently become increasingly applicable to eHealth [32].

A patient’s health record includes health information regarding several medical specialties, including cardiology, oncology, dentistry, mental health, physical data summary, and so forth. Each application area may have data in various electronic formats, including X-rays, blood tests, surgical reports, and SOAP (subjective, objective, assessment, and plan) notes for medical conditions.

Patients may want to give their doctor access to their data, but they may not want other people (like a pharmacist) to have more sensitive information than necessary; ideally, the visually encoded original data can only be retrieved by the user who has the correct key, as mentioned in the Introduction. PRIVocular thus allows a patient to authorize access to particular cloud-stored elements of their health information. At first, patients are the only ones with access to the encrypted electronic health record (EHR) data. It would be necessary to perform homomorphic actions (such as adding the necessary properties to the ciphertext within the encryption domain) to allow users to access the data or retrieve specific portions of the encrypted material. For instance, if the patient wants a pharmacist to retrieve a prescription, the patient adds the pharmacy type-identifier (such as pharmacy-id), the pharmacy location (from the key-derivation procedure of PRIVocular, depicted next), and the medical prescription ciphertext.

As illustrated in Figure 3, the cloud data storage services for EHR applications involve three entities: the Cloud User (U), the Cloud Server (CS), and a third-party auditor (TPA). The CS provides storage and processing power, while the TPA ensures the integrity of outsourced data while keeping user information private. For example, a pharmacist may be granted access to a patient’s electronic health record. The CS may operate nominally and follow protocol execution guidelines but may conceal data corruption or remove infrequently read files. The TPA party is considered trustworthy and impartial, with no particular desire to work with the CS or Cloud User during auditing. If the TPA can acquire outsourced data after the audit, it would be detrimental to the Cloud User. The Data Flow communication arrow could very well correlate to an air-gapped interconnection channel; thus, it is exactly where PRIVocular could contribute to the process.

2.5. Security Design Goals of the Medical Use Case

The protocol/framework design should ensure security and performance guarantees for cloud data storage under efficient cryptographic primitive schemes, allowing privacy-preserving public auditing. This is the first step towards private information retrieval.

Public auditability: allows third-party auditors (TPAs) to independently confirm the accuracy of data stored in the cloud without obtaining a copy or adding extra processing time for cloud users.
Storage correctness: ensures legal audits and keeps users’ data unaltered.
Privacy preserving: ensures sensitive user data content cannot be obtained or outsourced by TPA parties.
Batch auditing: allows multiple TPA parties to conduct secure audits concurrently from a large number of users.
Lightweight: the protocol uses minimal computational and communication resources.

The EHR-based cloud server system must provide users with protection and privacy, ensuring the confidentiality of health data even if the data server is compromised. The system must guarantee that the CS will not discover any information about the content of files from ciphertexts or keyword queries. As will be shown in the following sections, the PRIVocular framework can satisfy all these cybersecurity constraints.

2.6. Related Work

Extended reality (XR), a combination of virtual reality (VR), augmented reality (AR), and mixed reality (MR), has gained significant attention in recent years due to its ability to create immersive experiences by blending digital content with physical environments. This has led to transformative applications in healthcare, such as medical training, remote surgeries, and therapeutic interventions. Recently, there has been quite an extensive research effort to develop commercially available and open-source applications built into integrated VR-ready frameworks. These applications range from performing OCR for translation and text-to-speech conversion through a user interface interaction to image map recognition for geolocation services or even securing video streaming on smartphones. Next, we mainly focus on relevant VR-based applications, which utilize and process visually compressed data sources and any security constraints integrated with the Android API framework [24].

In [7], the authors have developed an Android application that combines the Tesseract OCR engine, the Bing translator, and the built-in speech-to-text technology of smartphones to perform text detection and translation from a captured image source. Using this application, the authors claim that travelers visiting a foreign country will be able to understand the messages portrayed in different languages. Finally, visually impaired users could access important messages from printed text through speech features.

From the perspective of the QR encoding technique, the authors in [9] build a framework using Java and Android that implements a digital signature technique for electronic prescriptions to prevent cybercrime problems such as theft, modification, and unauthorized access. The prescription is encoded into a conventional QR code and encrypted using an asymmetric algorithm. To decrypt the QR image tag with the prescription, a third user must log in to the Android part of the framework application to gain access to the public key. If the verification process is successful, the application will display the prescription decrypted from its corresponding QR tag. In [10], the authors suggest designing and implementing a two-factor identification authentication system using QR codes. Their system claims to provide another level of security, where the QR code acts as the first factor and the Android mobile system as the second. QR-TAN [26] is a smart authentication technique that relies on QR tags and smart cards to validate electronic transactions. QR-TANs authenticate transactions using a trusted device, such as a smartphone. Finally, Refs. [11,12,13,14,15] discuss relevant work in the field, where QR codes are being utilized to visually encode and transmit data correctly and provide cryptographic mechanisms for confidentiality or authentication purposes.

Ubic [1] is a perceptual framework based on head-mounted displays and computer terminals that performs a wide range of cryptographic primitives, such as secure identification, document verification using a novel physical document format, and content hiding. DARKLY [33] offers a privacy protection layer in untrusted perceptual applications over trusted devices. In [25], the authors propose an application for streaming video on Android smartphones that can capture video from a smartphone camera and then send it to the computer in real time. The contribution is that the video can be secured by selective encryption, in which only the critical parts of the video are encrypted. A general framework for the development of multiplayer augmented reality-based adventure games is proposed in [34]. The framework can create online treasure hunt or scavenger hunt games for mobile (Android-based) devices. It offers integrated image recognition support combined with GPS-based localization. Specifically, image recognition is used to determine the exact location of the player, and then a picture is displayed in augmented reality mode. Perhaps the most recent research effort similar to our solution is [35]. There, when visual tracking is enabled, a novel visual cryptography technique is used that is tolerant to users’ head motion and slight misalignment of the two shares of encrypted visual information. However, the scheme is not very scalable because it relies only on generating a two-share scheme, i.e., a one-time pad, and it remains ad hoc for this sole purpose, despite using virtual reality. Finally, the authors in [36] extend the Physical Unclonable Function (PUF) and Virtual Proof (VP) sensors to prevent key exchange attacks based on hardware implementation rather than number theory. They claim that the novel key exchange methodology is developed and demonstrated using experimental data based on a Virtual Proof of Reality.

The widespread use of XR technologies has raised cybersecurity and privacy concerns, posing significant risks to user trust and data security. These systems collect extensive user data, including behavioral, biometric, and locational information, which can be accessed without authorization, leading to identity theft, impersonation, and exploitation of sensitive information [37]. XR applications are particularly vulnerable to attacks such as eavesdropping and data interception. Cybersecurity threats in XR applications include data interception, biometric data risks, and location privacy. Continuous connectivity to servers exposes XR applications to cyberattacks like DDoS and man-in-the-middle attacks, leading to data breaches [38]. Biometric data, such as eye tracking and gesture recognition, can reveal behavioral patterns and health information that can be exploited by malicious actors. Spatial mapping data can inadvertently disclose the physical environment of (medical) users, posing significant risks to location privacy. Emerging threats, such as deepfakes and synthetic media, pose challenges for XR security, as adversaries could create realistic avatars or manipulate content to deceive users, leading to fraud or harassment scenarios.

The data privacy perspective is perhaps one of the most crucial. XR (extended reality) relies on extensive data collection to enhance personalization and realism. This data collection often involves sensitive information such as personal identifiers, behavioral patterns, biometric details, and location information. However, some XR platforms lack robust security controls, increasing the risk of data leakage. This can lead to breaches, undermining user trust and deterring the adoption of XR [39,40].

A key privacy threat in XR systems is the leakage of sensitive user data, such as personal identifiers, biometrics, and behavioral patterns. These data can be exposed to unauthorized third parties due to inadequate encryption during data transmission. Data leakage risks are particularly pronounced in applications where data are stored on cloud-based servers without strict access control. As XR expands across industries, cross-platform data sharing heightens these risks. Biometric data are the most vulnerable type of data in XR environments. Current security approaches may not be sufficient to address these sophisticated threats [20].

To address the limitations of existing SoTA works, that is, the lack of IoT flexibility, cybersecurity scalability, and interoperability of security protocols, the unique contribution of the proposed PRIVocular framework is that it focuses on purely privacy-preserving information transfer through automatic detection of ciphertext data from display sources, providing a seamless experience to the user.

3. The PRIVocular Framework

3.1. Cryptographic Properties

To make a practical contribution, as discussed in the Introduction section, we mainly build a resource-friendly graphical user interface and an easy-to-use system that can be deployed into the current Android-based mobile infrastructures [24], even for the eHealthcare vertical use case. From a technological perspective, PRIVocular offers the following (overall) security functionalities and novelties:

Authentication. PRIVocular’s generic encryption scheme (Paillier) includes a cryptographic key generation and distribution phase. The framework realizes this technique through a high-definition camera from smart mobile devices and a computer node (key server). The previous process allows users to authenticate before creating, encrypting, sharing, and decrypting any type of written text, or EHR data. Specifically, the encryption and decryption keys produced during this phase are encoded as QR tags so that no eavesdroppers could potentially read them (blind eye) apart from the authenticated users themselves. At this point, it is implied that the user(s) that will generate the keys (public and private) are the same user(s) that will eventually be able to obtain the plaintext through PRIVocular.

Content Hiding/Copyright Data Protection. We provide an end-to-end solution to ensure privacy through air-gapped transmission channels. Rather than projecting the ciphertext(s) and public/private keys on the screen as is, we encode them into a hexadecimal format and concatenate them so that no human user could use brute force to locate each encrypted group segment that belongs to a single ciphertext character. Only the authenticated user(s) possessing the required decryption key can perform the ungrouping successfully.

Efficient Data Encapsulation method(s). PRIVocular offers three different data encapsulation and transmission methods: OCR, QR-based, and custom-QR-based (MoMAtag). Each specific method possesses different design specifications and performance efficiency; however, for the proof-of-concept, we include a comparative performance study and comprehensive visual analysis for all in Section 4.1.

OCR Filtering Intelligence. Another novelty of the framework, within the OCR mode of operation, is its software capability to distinguish between encrypted and non-encrypted text on computer displays. For instance, PRIVocular’s OCR engine can separate the ciphertext and nonencrypted text and produce the resulting plaintext nominally.

MoMAtag. We have expanded the conventional storage amount properties of the traditional QR tag version 40 to fit almost double the previous size for our encryption needs. This was particularly driven by the incremental size of the ciphertext data on the screen while the key size increased.

3.2. PRIVocular’s Architecture

PRIVocular is a VR/XR framework that allows end users to choose between three visual data encoding techniques. Both the client-side and server-side parts of the application have three operation modes. The data are encoded using one of the three encoding types, then encrypted and visually captured on conventional computer displays by the UHD camera of a VR device. Finally, decoding and decryption are performed on the server side. The reverse cycle of encoding–encryption–transmission–decoding–decryption (see Figure 4) depends on the underlying cryptographic scheme. Specifically, if the end user has not obtained the correct encryption/decryption keys, she/he cannot generate the correct ASCII-typed characters.

Thus, the goal of confidentiality should be satisfied by construction in the PRIVocular framework. Integrity and performance would be the most challenging parts.

The PRIVocular system framework, as in Figure 5, consists of the following (hardware) key elements: (1) Prompting displays could range from wall projector sources, UHD monitors, and LCD computer displays to smartphone device monitors. The particular display sources serve the key role of displaying the end-user-typed characters and recognizing or detecting all the visually encoded ASCII characters from the server-side part of the application. (2) A client-side smartphone, or tablet, that will allow the user(s) to type or input any ASCII character, as keystrokes, to PRIVocular. (3) A server-side smartphone, tablet, or VR smart glasses, equipped with a UHD camera, which can capture the visual encoding format and perform the decoding and decryption cycles to retrieve the correct data.

PRIVocular consists of the following (software) key element technologies: (1) Key Generation and Distribution Server. A third node in the system (hardware node) generates the public/private cryptographic keys for encryption/decryption. The key server is written in Java and could be deployed with the prompting-display computer node. We assume that the corresponding (always trusted) key storage server has enough computational and memory resources for tackling issues such as key rotation, secure storage, and disposal, particularly within the framework of a public-key cryptosystem, regardless of the public key size. (2) Android-based API for both the client–server software implementation part of the framework.

PRIVocular encompasses the two following information transmission methods: (1) Bluetooth wireless technology, to transmit the keystrokes from the client part of the framework to the prompting devices, which are, of course, connected to a Bluetooth adapter, and (2) visual transmission, i.e., images and optical characters are transmitted on the visible spectrum (captured by smart device cameras).

Finally, the three end-to-end modes of operation for PRIVocular are: (1) optical character recognition (OCR), which is achieved by displaying optical characters in Base16 format on the prompting displays, (2) MoMAtag, which is a customized QR-code tag based on the Quick Response Code (see Section 2.2), able to fit double the size of the QR version 40, and (3) hybrid, which is based on a grid layout of conventional QR codes, with each QR dedicated to a single encrypted ASCII character and using any version from 1 to 40, depending on the size of the individual ciphertext.

Figure 5 also depicts the PRIVocular processing and interaction pipeline, end-to-end, for the different encapsulation modes. It is worth mentioning, at this point, that the framework can scale to multi-party environments, i.e., multiple authenticated users with the same public/private key, as long as they are all trusted parties.

3.3. eHealth Use Case Design Applicability

PRIVocular’s main goal is to generate and optically capture and reconstruct the maximum number of ASCII characters typed by end-to-end medical user(s) (patients and doctors). Because we select transmission on the visible spectrum, and due to the limited spacing of characters on any display, there is always a significant trade-off between ocular performance and accuracy. As mentioned earlier, accuracy or integrity is vital for successful encryption/decryption. Due to the mathematical properties of the Paillier cryptosystem (i.e., uniqueness of plaintext–ciphertext space mapping), it is crucial to reconstruct or decode the original user data from any mode of operation in a 100% error-free state. Room lighting conditions can significantly affect visual performance as well. External factors, such as room brightness, visual distortion or reflections on the screen, color interference from different lighting sources on the display, or even the physical distance between the smart device’s camera and the monitor, play a key role. Another critical factor, especially for the QR/MoMAtag cases, is a malevolent user trying to maliciously alter or destroy the image tag to compromise the integrity of the framework. In any case, if data are decoded erroneously, due to either deliberate or natural causes, decryption will instantly fail.

Thus, it is important to introduce error-correction-detection codes for all cases or modes, as in the MoMAtag and hybrid methods. Still, in the OCR case, we only allow raw data to be encoded and displayed without any error correction for the proof of concept. The PRIVocular framework has been explicitly designed to perform source detection (input visual content from multiple sources) and client detection (deciding to which client visual content, such as visual shared keys, is explicitly shared). For the first case, the framework can adapt to any custom display capabilities, i.e., detect the custom display resolution and graphic capabilities, and based on those specifications, decide the optimal display parameters (font size, zoom level, orientation) of the optical characters or QR tags. For the second, PRIVocular functionality corresponds only to the end client that carries his/her specific encryption/decryption key. We could state that authentication is another derived goal of PRIVocular, meaning only the correct key-holding client can successfully derive the original plaintext. Furthermore, because the software part of PRIVocular runs on smart devices’ client hardware, the framework follows a minimalistic design approach. The framework treats security constraints as its top priority and manages performance within them.

To conclude this subsection, a typical applicability scenario for PRIVocular, based on Figure 5 and Figure 7, follows the sequence of events below (see also Figure 6, which involves the usage of Metaverse/smart glasses):

Key Generation and Distribution phase
1. The end user inputs his/her desired key size in the (software-based) key server’s input prompt.
2. The key server creates the public (encryption) key QR in addition to the private (decryption) key QR based on random prime numbers p and q, each time, for the Paillier cryptographic scheme.
Data Encapsulation and Encryption phase (client side)
3. The client-side application of PRIVocular reads the public key QR to generate the encryption parameters.
4. The end user then selects a visual encoding technique, or data representation method, among OCR, hybrid, or MoMAtag from the software GUI.
5. The end user can now start typing ASCII characters from the client application. Encryption is performed in stream mode (i.e., the ciphertext is produced per character typed on the fly).
6. Through the Bluetooth communication interface, each character typed is sent to the prompting device screen via a Bluetooth (server) adapter. The previous implies that all PRIVocular devices should be paired via Bluetooth and become synchronized. The process depends on the data encapsulation mode as follows:
- Hybrid mode: Each time a character is typed, a QR code is shown on the prompting screen in a grid layout, with the corresponding ciphertext of the typed character as its content.
- MoMAtag mode: Encryption is performed in block mode, meaning after the user has typed an ASCII character, she/he will generate the MoMAtag with its ciphertext, a posteriori.
Data Decoding and Decryption phase (server side)
7. The server-side application of PRIVocular reads the private key QR by the same authenticated end user to generate the decryption parameters.
8. The end user will then select the applicable mode of operation, depending on their generated or encoded form of plaintext data.
9. The end user, provided she/he has obtained the correct decryption key format, can now successfully decode and decrypt, thus reconstructing the original message on the smart device’s display.

3.4. PRIVocular Key Components

Inside this section, we present, in more detail, the key server features and functionality for the PRIVocular framework, as well as the three modes of operation, or data encapsulation means: OCR, MoMAtag, and hybrid.

3.4.1. Key Generation and Distribution Server

The key server is a software-based key generation and distribution component for the encryption and decryption procedures of the Paillier cryptographic scheme that we utilize in PRIVocular. It accepts a single input parameter from the end user, the key size, which should be a power of 2, from 32 bits to 4096 bits, for the whole range of the framework functionality. The process of key generation and distribution is illustrated in Figure 8.

The key server inside the scope of PRIVocular’s functionality is considered a fully trusted server. It should be emphasized that no cryptographic keys are being stored in the non-volatile memory of the server, and all cryptographic primitives being produced are being placed only in the RAM. Encryption and decryption keys are outputted as QRs, only displayed on the screen terminal as UNIX non-printable ASCII characters (black and white stripe blocks), without being saved or stored as image files [41]. Thus, after a few seconds, the QRs are flushed from the terminal screen regardless of whether the end user has scanned them.

As discussed in Section 2.4, the essential cryptographic parameters to construct the public (encryption) key are the pair $(n, g)$, whereas to derive the private (decryption) key, the user needs $(\lambda, \mu)$ [22]. Upon user (key size) input, the key server generates two random prime numbers based on the key length in bits and produces the two corresponding pairs (encryption and decryption) to be embedded inside the two key QRs. It is worth noting that even if the end user inputs the same key size in bits each time, the encryption and decryption keys differ between runs due to the secure randomness of operations. Thus, decryption with a key pair generated in a different run would be impossible, even with the same key size. Figure 9 demonstrates the content data, in packet illustration format, for the encryption and decryption key QRs.
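To make the roles of the public pair and the private pair concrete, the following is a minimal Python sketch of Paillier key generation, encryption, and decryption. It is not the Java key server described here: the function names, the Miller-Rabin prime search, and the common simplification $g = n + 1$ (which gives $\mu = \lambda^{-1} \bmod n$) are all assumptions made for illustration.

```python
import math
import secrets

def is_probable_prime(n, rounds=40):
    """Miller-Rabin probabilistic primality test."""
    if n < 2:
        return False
    for small in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29):
        if n % small == 0:
            return n == small
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for _ in range(rounds):
        a = secrets.randbelow(n - 3) + 2          # witness in [2, n-2]
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False
    return True

def random_prime(bits):
    """Random prime with exactly `bits` bits."""
    while True:
        candidate = secrets.randbits(bits) | (1 << (bits - 1)) | 1
        if is_probable_prime(candidate):
            return candidate

def generate_keypair(key_bits):
    """Return ((n, g), (lam, mu)) where n has exactly key_bits bits."""
    half = key_bits // 2
    while True:
        p, q = random_prime(half), random_prime(half)
        n = p * q
        if p != q and n.bit_length() == key_bits:
            break
    g = n + 1                                      # assumption: simplified generator
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)   # lcm(p-1, q-1)
    mu = pow(lam, -1, n)                           # valid because g = n + 1
    return (n, g), (lam, mu)

def encrypt(pub, m):
    """c = g^m * r^n mod n^2 with a fresh random r coprime to n."""
    n, g = pub
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(pub, priv, c):
    """m = L(c^lam mod n^2) * mu mod n, with L(u) = (u - 1) // n."""
    n, _ = pub
    lam, mu = priv
    return ((pow(c, lam, n * n) - 1) // n * mu) % n
```

Calling `generate_keypair(64)` twice yields two unrelated key pairs, which mirrors the observation that a fresh run of the key server produces different keys even for the same key size. The modular inverse via `pow(lam, -1, n)` requires Python 3.8 or later.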

Finally, as depicted in Figure 9, we include a new packet header inside the two key QRs’ content data, which is named maxgrouplength. This parameter is not an actual cryptographic primitive essential for any Paillier encryption or decryption calculations; its role is to group, or separate into ciphertext segments, all the encrypted ASCII characters (encoded in Base 16 format) in the OCR mode. Grouping is vital for post-decoding OCR decryption. As described in the next section, once all the individual OCR ciphertext(s) have been recognized successfully, the application engine requires the correct group length for all encrypted ASCII characters, based on the decryption key parameters, to decrypt them. Thus, maxgrouplength is directly derived from the post-encryption process. It aids the application algorithm in segmenting the decoded OCR result text and matching it to the corresponding individual encrypted ASCII characters. Without the grouping parameter, locating each ciphertext, finding the total number of ciphertexts, or even performing decryption would be challenging.

The method by which maxgrouplength is calculated by the internal functionality of the key server is as follows: Based on the encryption key parameters, already computed and derived, the key server performs an «internal» encryption between the minimum printable ASCII character and the maximum printable ASCII character. The server will then be able to find the maximum hexadecimal length needed for any one of the output ciphertext characters based on sequential comparisons. The maxgrouplength parameter should now correspond to any encryption length range of the printable ASCII characters, provided they are encrypted with the same key parameters. Finally, to cover the case when the produced ciphertext might have a shorter length than the previous parameter (maxgrouplength), we perform zero padding in the most significant HEX digits, with the necessary number of zeros.
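The procedure above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the key server's code: the modulus `N` is built from two small fixed primes, `g = n + 1` is assumed, and the names `max_group_length`, `to_group`, and `split_groups` are hypothetical.

```python
import math
import secrets

# Assumption: a toy 32-bit modulus built from two fixed primes, for illustration only.
N = 65521 * 65537

def encrypt(n, m):
    """Paillier encryption with g = n + 1 (a common simplification)."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def max_group_length(n):
    """Encrypt every printable ASCII code (32..126) once and keep the longest hex length."""
    return max(len(format(encrypt(n, code), "X")) for code in range(32, 127))

def to_group(ciphertext, maxgl):
    """Zero-pad the most significant hex digits so every group has length maxgl."""
    return format(ciphertext, "X").zfill(maxgl)

def split_groups(hex_stream, maxgl):
    """Cut a concatenated hex stream back into fixed-length ciphertext groups."""
    return [hex_stream[i:i + maxgl] for i in range(0, len(hex_stream), maxgl)]

maxgl = max_group_length(N)
stream = "".join(to_group(encrypt(N, ord(ch)), maxgl) for ch in "Hi")
groups = split_groups(stream, maxgl)
```

Because Paillier encryption is randomized, a length measured on one batch of sample encryptions is an empirical maximum. Every ciphertext is smaller than $n^2$, so the hex length of $n^2$ is a deterministic upper bound that could be used instead.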

3.4.2. OCR

Optical characters are the first mode of visual data encapsulation for PRIVocular. Figure 10 illustrates the OCR process from data generation, encryption, encoding, encapsulation, and vice versa. Once the end user types an ASCII character from the client-side part of PRIVocular, the ASCII character is zero-padded, based on the maxgrouplength parameter, then converted to the decimal numeral system to be encrypted by the Java-based Paillier library, and finally converted to Base 16 for the OCR representation method. Since transmission is performed on the visible spectrum, OCR images are captured and analyzed by the UHD camera of the server-side part. The OCR text is first recognized by the OCR engine; depending on the physical distance, lighting conditions, and various other optical parameters, successful recognition is not always 100% guaranteed, but it is a vital process for successful decryption to proceed.

The OCR procedure does not follow any error-correction-detection technique; thus, we have to introduce a confidence metric for the level of text errors introduced. For some OCR applications, like PRIVocular, it may be important to know the reliability of the recognized text generated by the engine. The confidence metric, or mean confidence property, expresses the certainty of the character recognition and ranges between 0 and 100 [42]. A value of 100 means that the engine recognized the character with high confidence. Applications that examine character confidence information can use a threshold value. As demonstrated in Figure 11 below, a character whose confidence value falls below the threshold is treated as a suspicious result. Based on experiments, we identified that a value of 64 is best for this purpose. A value of 64 or more indicates high confidence that the character was recognized correctly. A value less than 64 marks that character as suspicious.
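The thresholding rule can be expressed as a short Python sketch. The input format, a list of `(character, confidence)` pairs, and the function name are assumptions for illustration; the actual OCR engine's output structure is not specified here.

```python
CONFIDENCE_THRESHOLD = 64  # value reported in the text as working best

def split_by_confidence(recognized, threshold=CONFIDENCE_THRESHOLD):
    """Separate (char, confidence) pairs into trusted and suspicious characters."""
    trusted, suspicious = [], []
    for char, confidence in recognized:
        if confidence >= threshold:
            trusted.append(char)       # 64 or more: high confidence
        else:
            suspicious.append(char)    # below 64: flagged as suspicious
    return trusted, suspicious

trusted, suspicious = split_by_confidence([("A", 91), ("3", 64), ("F", 40)])
```

Since a single misread hex digit makes decryption of its whole group fail, a caller might discard any ciphertext group that contains a suspicious character rather than attempt decryption.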

One main contribution of PRIVocular in the OCR mode is the capability of the OCR engine to detect, filter, or isolate encrypted and non-encrypted text. In most real-case scenarios, encrypted ASCII characters, represented in Base 16, can be confused with normal ASCII characters, numbers, computer graphics on screen, etc. Ideally, we would prefer to depict only the ciphertext(s) on the display. Still, very often, even a single character or digit not belonging to the ciphertext set could easily ruin the decryption process. That is because the OCR engine searches the whole capture area of the device’s camera to detect optical characters. The recognition effort might therefore include several other ‘foreign’ digits, even within the hexadecimal range. Thus, as described in Algorithm 1 below, the PRIVocular OCR engine has extra intelligence to distinguish between ciphertext(s) and non-encrypted text in the same visible area.

The key idea of the filtering algorithm is to extract the ’purely’ hexadecimal digits from all OCR text that the engine detects on screen. That would eventually limit the results significantly to the scope of isolating only the Base 16 ciphertext(s). Still, though, even at this step, the algorithm might (wrongly) input decimal digits (0–9) that correspond to Base 16 or even some ASCII characters, i.e., Aa, Bb, Cc, Dd, Ee, Ff, that would be mistakenly considered to belong to the previous correct set. For that purpose, the algorithm performs a sliding window search, whose size is the maxgrouplength parameter (see Section 3.4.1), for all the Base 16 filtered OCR text to check a priori if the corresponding groups of HEX values can be decrypted or not. Thus, for each elementary HEX group, if decryption succeeds, the application will immediately show its plaintext value in the correct XY coordinates on the screen where it specifically appears. If decryption fails for any group, this search will dynamically continue on the next OCR input text result. At this point, the previous process occurs on the fly, which means that the application will show any plaintext character that could be decrypted, regardless of the rest (see Figure 11).

Algorithm 1 Encrypted OCR Filter Algorithm.

procedure main
    $maxgl \leftarrow maxgrouplength$
    $OrigOCR \leftarrow OCR\_TextResult$
    $HEX\_OCR \leftarrow extractHEXchars(OCR\_TextResult)$
    for ($i = 0$; $i \leq stringlenof[HEX\_OCR] - maxgl$; $i \leftarrow i + 1$) do
        $idx\_start \leftarrow i$
        $idx\_end \leftarrow (i + maxgl)$
        $ciphertextgroup \leftarrow HEX\_OCR.SubString(idx\_start, idx\_end)$
        $ctg \leftarrow ciphertextgroup$
        $res \leftarrow decrypt(ctg)$
        if ($res$ is ASCII) then
            // Decryption successful
            $pre\_decrypt\_results \leftarrow append(res)$
            if $ctg \in OrigOCR$ then
                $pos\_index \leftarrow OrigOCR.FindIndexOf(ctg)$
                $decrypt\_res\_coordinates \leftarrow append(pos\_index)$
            end if
        end if
    end for
    $postOCR(pre\_decrypt\_results, decrypt\_res\_coordinates)$
end procedure
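A runnable Python sketch of the sliding-window filter is given below. It follows the steps of Algorithm 1 but makes several assumptions that are not taken from the PRIVocular implementation: a toy Paillier key built from two fixed Mersenne primes with $g = n + 1$, a group length taken as the hex length of $n^2$, a printable-ASCII test (codes 32 to 126) for "res is ASCII", and results returned as `(character, position)` pairs instead of two separate lists.

```python
import math
import secrets

# Assumption: toy key from two known Mersenne primes, for illustration only.
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)   # lcm(p-1, q-1)
MU = pow(LAM, -1, N)                                 # valid because g = n + 1
MAXGL = len(format(N2, "X"))                         # upper bound on hex length

HEX_DIGITS = set("0123456789abcdefABCDEF")

def encrypt_char(ch):
    """Encrypt one character and return its zero-padded Base 16 group."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    c = (pow(N + 1, ord(ch), N2) * pow(r, N, N2)) % N2
    return format(c, "X").zfill(MAXGL)

def decrypt_group(hex_group):
    """Paillier decryption of one hex group; returns an integer plaintext."""
    u = pow(int(hex_group, 16), LAM, N2)
    return ((u - 1) // N * MU) % N

def filter_encrypted_ocr(ocr_text, maxgl=MAXGL, decrypt=decrypt_group):
    """Slide a window of maxgl hex digits over the OCR text and keep decryptable groups."""
    hex_ocr = "".join(ch for ch in ocr_text if ch in HEX_DIGITS)
    results = []
    for i in range(len(hex_ocr) - maxgl + 1):
        group = hex_ocr[i:i + maxgl]
        res = decrypt(group)
        if 32 <= res <= 126:                 # treated as "res is ASCII"
            pos = ocr_text.find(group)       # -1 when the group is not contiguous on screen
            results.append((chr(res), pos if pos >= 0 else None))
    return results

# Ciphertext groups surrounded by unrelated text that also contains hex-like characters.
screen = "Invoice 2025 " + encrypt_char("H") + " -- " + encrypt_char("i") + " footer"
decoded = filter_encrypted_ocr(screen)
```

With a modulus of this size, a misaligned window decrypts to an effectively random value, so the chance that it lands in the printable ASCII range is negligible. For the very small key sizes at the low end of the supported range, false positives become more likely, and an additional check may be worth considering.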

3.4.3. MoMAtag

The main contribution of MoMAtag (MoreMAssivetag) is that it fits (almost) twice the size of the conventional QR code Version 40 in binary data encoding mode and with a high ECC level. It is an extension of the traditional QR code tag to a maximum size. We constructed the MoMAtag design specifications by expanding (1) the data_capacity table and (2) the Error Correction Code Words and Block Information tables of the conventional QR [8]. The version_size, position_adjustment, and version_pattern tables were not altered. Furthermore, to increase the effectiveness of the Reed–Solomon error correction and detection algorithm for such a larger data capacity QR tag (MoMAtag), we introduced new values for the generator polynomial, as well as the log and antilog values used in the algorithm’s GF(256) arithmetic [27]. The following table (Table 2) illustrates the comparison parameters between traditional QR Version 40 and MoMAtag.

MoMAtag has been pre-selected, as a design assumption, to host one ASCII character encrypted with a 4096-bit key size. Such a ciphertext string (2048 HEX characters) would be impossible to fit inside the QR-code Version 40H, but with the MoMAtag, it is possible, combined with a high ECC level. It would be possible to increase the size of MoMAtag by further modifying the previous table parameters. That way, we could fit more ciphertext(s) with even larger key sizes.
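The size argument can be checked with a short calculation. A Paillier ciphertext is an integer modulo $n^2$, so for a key of $k$ bits it occupies up to $2k$ bits, or $2k/4 = k/2$ hexadecimal characters. The Python sketch below compares this with the byte-mode capacities of QR Version 40 from the QR code standard (ISO/IEC 18004); it assumes, for illustration, that each hexadecimal character is stored as one byte, and the function names are hypothetical.

```python
# Byte-mode data capacity of QR Version 40 per ECC level (ISO/IEC 18004).
QR_V40_BYTE_CAPACITY = {"L": 2953, "M": 2331, "Q": 1663, "H": 1273}

def ciphertext_hex_chars(key_bits):
    """Hex digits needed for one Paillier ciphertext (an integer below n^2)."""
    return 2 * key_bits // 4

def fits_qr_v40(key_bits, ecc_level):
    """Assumption: one hex character is encoded as one byte in byte mode."""
    return ciphertext_hex_chars(key_bits) <= QR_V40_BYTE_CAPACITY[ecc_level]

for bits in (32, 512, 2048, 4096):
    print(bits, ciphertext_hex_chars(bits), fits_qr_v40(bits, "H"), fits_qr_v40(bits, "L"))
```

Under this assumption, a 4096-bit key gives 2048 characters, which exceeds the 1273-byte capacity of Version 40 at level H but stays within the 2953 bytes available at level L.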

Figure 12 shows a MoMAtag. Figure 13 demonstrates the MoMAtag mode of operation, with the corresponding ASCII character (x) detected, decoded, and decrypted on the smart device screen.

3.4.4. Hybrid

Finally, we discuss the hybrid mode of operation for the PRIVocular framework, end-to-end. The main idea behind this operation mode is to show on the prompting device one conventional QR code (Version 1 to 40) dedicated to every ASCII character typed on the client side. Each QR code contains the ciphertext of the typed character, again in Base 16 encoding format. The QRs are displayed in a grid-style layout, in the same fashion as a user types normal letters to form a sentence. Although PRIVocular’s hybrid mode engine decrypts what it sees on screen (WYSIWYG), it is not possible to use brute force on the total length of the plaintext because the terminal screen buffer might already hold more QR tags from before, i.e., previously typed characters.

The hybrid mode for PRIVocular works for all user-determined key sizes from 32 bits to 2048 bits. It is worth noticing that even with a key size of 4096, one encrypted ASCII character could easily fit into a QR version 40, but with a low ECC Level. Thus, we decided to utilize MoMAtag and separate the hybrid and MoMAtag modes of operation, although they are both QR-based.

PRIVocular deploys the ZXing QR decoding libraries to detect and decode QR codes. ZXing (“zebra crossing”) is an open-source, multiformat 1D/2D barcode image processing library implemented in Java, with ports to other languages [43]. The PRIVocular implementation, which is Android-based, modifies these libraries to further perform decryption in bulk mode, as well as to display the original plaintext ASCII characters at the exact XY coordinates of their corresponding QR codes on the device screen. The bulk mode option of the ZXing QR engine facilitates the decoding of multiple QR codes at once.

Finally, QRs are displayed directly on the UNIX terminal screen as non-printable ASCII characters rather than image files [41]. This design criterion allows easier code manipulation, e.g., removing a QR when a user presses Backspace to remove a typed ASCII character. Furthermore, it is crucial to control the size of the displayed QRs in bulk, since it changes with the key size; by manipulating the QRs as String variables in the terminal, it is more practical to alter their display dimensions, e.g., by changing the terminal font size, to increase performance.

Figure 14 shows the hybrid mode of operation, where the grid layout (plaintext) display mode is presented.

4. Experimental Results and Analysis

4.1. Experimental Analysis

The experiment setup (for both analysis and final results) consists of utilizing a smart device (i.e., smart glasses or smartphone) camera (HD 720p) positioned at a stable viewing distance (approximately 43 cm) from the prompting device node. This node consists of a desktop PC monitor (with 1920 × 1080 native pixel resolution). The lighting conditions were considered as the default, i.e., at normal room lighting. There were no visible obstacles between the camera and the source. Conditions such as the viewing angle and line of sight were not considered and will be explored in future work. An example of the experiment’s functionality follows in Figure 15.

The (pre-final) visual experimental setup consisted of capturing two image types:

QR Low. A low-resolution QR Tag (372 × 359 pixels)
Text/OCR. A medium resolution Text-on-Screen file (1084 × 584 pixels)

To analyze the image statistics of each image type, the skimage Python library was utilized, and additional analysis of various statistics, i.e., number of pixels per color channel, image entropy, and various histograms, was also performed. The goal of this initial setup is to understand which is the most efficient optical data transmission type through the visible spectrum. We also investigate image entropy. In information theory, information entropy is the log-base-2 of the number of possible outcomes for a message. For an image, local entropy is related to the complexity of a given neighborhood, typically defined by a structuring element. The entropy filter can detect subtle variations in the local gray level distribution.
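As a small illustration of the entropy notion used here, the following standard-library Python sketch computes the Shannon entropy of a gray-level histogram. It is a global measure over a flat list of pixel values, not the local (neighborhood-based) entropy filter from skimage used in the analysis, and the function name is an assumption.

```python
import math
from collections import Counter

def gray_level_entropy(pixels):
    """Shannon entropy, in bits, of the gray-level distribution of a flat pixel list."""
    counts = Counter(pixels)
    total = len(pixels)
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())

two_level = [0] * 50 + [255] * 50      # idealized black-and-white tag
all_levels = list(range(256))          # every gray level equally likely

print(gray_level_entropy(two_level))   # 1.0
print(gray_level_entropy(all_levels))  # 8.0
```

An image that uses only two gray levels in equal proportion carries 1 bit per pixel, whereas 256 equally likely gray levels give the maximum of 8 bits, which matches the log-base-2 definition for equally likely outcomes.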

4.1.1. QR Low

In the first example, the image comprises two surfaces with slightly different distributions. The image has a uniform random distribution in the range [−14, +14] in the middle of the image and a uniform random distribution in the range [−15, +15] at the image borders, both centered at a gray value of 128. We compute the local entropy measure to detect the central square using a circular structuring element of a radius big enough to capture the local gray level distribution, as shown in Figure 16. The second example shows how to detect texture in the camera image using a smaller structuring element.

The histogram in Figure 17 is interpreted as follows: The bins (0–255) are plotted on the x-axis, and the y-axis counts the number of pixels in each bin. The majority of pixels fall in the range of 230 to 255. Looking at the right tail of the histogram, we see almost every pixel in the range 200 to 255. This means many pixels that are almost ‘white’ are in the image. Based on this initial research, the preliminary conclusion is that the QR tag can be decoded successfully and error free from the visible spectrum almost instantly (Figure 18). Thus, it appears to be the most efficient means of optical transmission.

4.1.2. Text/OCR

Again, we compute the entropy parameters for the OCR text file. The results are depicted in Figure 19. The figure shows that the entropy range now appears more restrained than in the QR case. However, if we observe the histogram (Figure 20), we notice that this time, the bins on the x-axis are more widely distributed. Thus, the pixels are spread over a wider range of values than in the ‘black–white’ case of the QR. This diversity of pixel colors, together with the particular image entropy or gray level distribution, implies that the OCR text cannot be detected visually in an error-free state. Thus, it initially seems from the plots that OCR is not the most effective means of ocular data transmission.

4.2. Performance Evaluation of PRIVocular Framework

For the scope of the main evaluation analysis of the PRIVocular framework, we have conducted two sets of experiments: the first with a vertically oriented prompting PC display screen and the second with a horizontal or landscape orientation. For each case scenario, we have utilized the same smartphone device equipped with a UHD camera (16MP, 1080p) and the corresponding orientation mode (portrait or landscape).

Next follows the first results table (see Table 3). This visual analysis table corresponds to the vertical orientation mode for both end-to-end displays. The table includes all three input modes of operation, i.e., OCR, MoMAtag, and hybrid. The aim of this experiment set, as well as the second, would be to ideally capture and successfully retrieve the maximum amount of optical data on screen while maintaining several other technical parameters such as physical distance, visual space of data on screen, and encoding types as observational dependent variables. We pre-selected running this setup with a 32-bit key size for OCR/hybrid modes. The MoMAtag mode is dedicated to a 4096-bit key size.

In Table 3, the second column, named Num. of elements, matches the number of HEX characters that can be displayed, captured, detected, and decrypted from the prompting screen. This number can be divided by the maxgrouplength parameter to obtain the total number of ASCII characters optically recognized. The third column (i.e., Max. characters) is the maximum amount of (byte) data that can be successfully decoded and decrypted in the corresponding input mode. Finally, the column named Square Pixels Per Element can be considered a visual space metric, i.e., an estimate of the space (width in pixels × height in pixels) that each individual element of the corresponding input mode occupies on the desktop PC monitor.

Each mode of operation performs differently because of its specific optical encapsulation method. QR-based encapsulation, as in the MoMAtag/hybrid cases, seems able to fit more raw data, and it decodes and decrypts faster than OCR (due to the presence of ECC). Other factors, such as the image entropy characteristics of an OCR-text file compared to a QR image tag, as depicted in the previous subsection, contribute to this argument.

Using Table 3, we compare the OCR-based and QR-based representation methods. Although OCR appears to consume less visual space on the screen per individual element and successfully processes 1200 HEX elements, or 600 bytes, it is the least effective means of optical transmission. To justify the previous argument, it is easily observed that QR-based methods can capture more byte data, even while consuming more area space on the display screen. Thus, we can conclude that MoMAtag appears to be the most effective method of ocular encapsulation and transmission among the QR-based techniques and, as a total, from the first set of experiments.

For the second set of evaluations, we repeated the same setup with both devices in horizontal orientation and derived additional technical parameters as output results. Table 4 reports the OCR operation mode, while Table 5 reports the QR-based techniques (that is, MoMAtag/hybrid).

In this setup, the key size (ranging from 32 to 4096 bits) is the independent variable, and we analyze six dependent parameters derived from the experiment. For the OCR case, we introduce the mean confidence metric (see Section 3.4.2), as well as the time required for the engine to perform OCR detection (the time metric is computed internally by the Android application environment). The time metric applies to the QR-based methods as well, but the mean confidence metric does not. The reason is that the deployment of OCR in PRIVocular does not encompass any ECC method, whereas the hybrid/MoMAtag methods embed error-correction codes (Reed–Solomon) in their specifications. Finally, physical distance is the ideal distance between the device camera and the prompting display for successful recognition.

Table 4 shows that OCR is not functional for key sizes of 1024 bits or larger. An individual ASCII element encrypted with a key size greater than 512 bits consumes a large amount of space on the screen: its decryption depends on at least 512 HEX characters, and if a single one is misdetected by the OCR engine, the whole decryption process fails. The same table shows that the number of recognized ASCII characters decreases as the key size grows, down to a single character at 512 bits.
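This all-or-nothing behaviour can be illustrated with a simple model. The sketch below assumes that each HEX character is recognized independently with the same accuracy, which is a simplification, and the accuracy values are hypothetical rather than measured. The group lengths of 16 and 256 HEX characters follow from the ratios in Table 4 (752/47 and 256/1), and 512 is the figure quoted above.

```python
def group_success_probability(char_accuracy: float, group_length: int) -> float:
    """Probability that every HEX character of one ciphertext group is read
    correctly, assuming independent per-character errors (a simplification)."""
    return char_accuracy ** group_length


# HEX characters per encrypted ASCII element for selected key sizes.
group_lengths = {32: 16, 512: 256, 1024: 512}

# Hypothetical per-character OCR accuracies, for illustration only.
for accuracy in (0.99, 0.999):
    for key_bits, length in group_lengths.items():
        p = group_success_probability(accuracy, length)
        print(f"accuracy={accuracy}, key={key_bits} bits, "
              f"group={length} HEX -> P(success)={p:.4f}")
```

Under this model, the success probability of a group falls geometrically with its length, which is consistent with OCR becoming unusable once a single element spans several hundred HEX characters.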

Table 6 summarizes the basic security properties, performance, and other parameters of several private information retrieval (PIR) techniques studied for IoT eHealth cloud storage and compares them with the PRIVocular framework.

As the key size grows, each ciphertext group has a longer (segmented) HEX length and therefore requires more visual space. Less useful byte data then remain visible on screen, so the time to decode and decrypt becomes shorter. For the same reason, with fewer characters on screen to recognize, the mean confidence tends to rise as the key size grows (from 75 at 32 bits to 91 at 512 bits in Table 4).
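The growth of the group length can be checked against the reported figures. In the Paillier scheme, a ciphertext is an element modulo n², so it is twice as long as the modulus n. Assuming that the key size denotes the bit length of n and that every group is padded to full length, one encrypted ASCII character occupies key_bits/2 HEX characters. The following sketch, with an illustrative function name, verifies that this rule matches every populated column of Tables 4 and 5.

```python
def hex_group_length(key_bits: int) -> int:
    """HEX characters per ciphertext group: the ciphertext lives modulo n^2
    (2 * key_bits bits), and one HEX character carries 4 bits."""
    return (2 * key_bits) // 4


# (key size in bits, recognized ASCII elements, HEX characters) from Table 4.
table4_ocr = [(32, 47, 752), (64, 22, 704), (128, 9, 576),
              (256, 4, 512), (512, 1, 256)]

# Same columns from Table 5 (QR-based modes).
table5_qr = [(32, 24, 384), (64, 18, 576), (128, 8, 512), (256, 6, 768),
             (512, 3, 768), (1024, 2, 1024), (2048, 2, 2048), (4096, 1, 2048)]

# Every row satisfies: ASCII elements * group length == HEX characters.
rows_consistent = all(
    ascii_count * hex_group_length(key_bits) == hex_count
    for key_bits, ascii_count, hex_count in table4_ocr + table5_qr
)
```

Because the group length doubles with each doubling of the key size, the number of ASCII characters that fit in a fixed HEX budget roughly halves at each step.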

Finally, we analyze the last table of our evaluation (Table 5). Comparing only the number of recognized ASCII characters between OCR and hybrid, OCR seems at first glance to outperform the QR-based techniques. That conclusion does not hold once the other technical parameters are considered. At the 32-bit key size, QR tags consume roughly 25 times more visual space per element (54,756 versus 2184 square pixels) and, as mentioned before, recognize about 50% fewer ASCII characters. OCR nevertheless remains an error-prone technique whose detection time is in the tens of seconds. In contrast, with QR tags, data retrieval takes about one second for most key sizes (720 to 3500 ms across all key sizes in Table 5), and the performance is error free, with maximum availability.

The hybrid mode, which uses QR tags of a different size for each key size, performs consistently across the whole range of key sizes from 32 to 2048 bits. The 4096-bit case, as mentioned earlier, could in principle be handled by a conventional QR tag but is dedicated solely to MoMAtag, with high ECC.
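The trade-off at the 32-bit key size can be made explicit with the figures from Tables 4 and 5. The short sketch below is only arithmetic on the reported values; the variable names and the throughput measure (ASCII characters per second) are illustrative choices, not metrics defined by the framework.

```python
# 32-bit key size columns of Table 4 (OCR) and Table 5 (QR-based).
ocr = {"ascii": 47, "area_px": 2184, "time_ms": 30730}
qr = {"ascii": 24, "area_px": 54756, "time_ms": 2140}

# Visual space per element: QR needs about 25 times more area.
area_ratio = qr["area_px"] / ocr["area_px"]

# Share of ASCII characters lost when moving from OCR to QR: about 49%.
fewer_chars = 1 - qr["ascii"] / ocr["ascii"]

# Detection time: QR is about 14 times faster.
speedup = ocr["time_ms"] / qr["time_ms"]

# Illustrative throughput in ASCII characters per second.
ocr_rate = ocr["ascii"] / (ocr["time_ms"] / 1000)  # about 1.5
qr_rate = qr["ascii"] / (qr["time_ms"] / 1000)     # about 11.2
```

Even though a QR frame carries fewer characters, its much shorter detection time gives it a higher effective throughput under this measure.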

5. Conclusions and Future Work

The PRIVocular framework explores the use of consumer smart devices for VR interaction inside an integrated software environment that allows the encryption, encoding, and visual encapsulation of data in the medical domain. The reverse cycle of plaintext information retrieval mirrors the ciphertext display in most operation modes, except for OCR, which is error-prone. Robust cryptographic security, reliability, and strong privacy preservation in air-gapped communications are practically attained for IoT devices. We achieve a low encryption-key-exchange-decryption visual latency (less than 1000 ms for most key sizes in the QR-based modes) and good user flexibility (a physical distance of around 40 cm between the ciphered visual elements and the human avatar user). Here, we have presented the general functionality and design components of PRIVocular. We have also presented several key novelties and contributions of the framework, such as the ‘expanded’ QR tag, MoMAtag, and OCR ciphertext filtering capabilities. Finally, we conducted extensive evaluation and analysis experiments to test PRIVocular in real-case scenarios. The results demonstrate that QR-based techniques are more efficient than OCR-based encapsulation for on-the-fly image decryption.

The PRIVocular framework could be further enhanced in various respects. First, error detection and correction could be added to the OCR mode to increase the detectability and accuracy of optical ciphertext characters, which would also improve information denoising during (visual) decryption. Furthermore, the MoMAtag specification could be altered to host more encrypted raw data so that PRIVocular would work for even larger key sizes. A grid layout of MoMAtags could perhaps also be detected in bulk mode. Different encryption/decryption schemes could be deployed as well, including symmetric-key cryptosystems. Finally, we did not investigate cryptanalysis against XR security-related attacks because we rely on the cryptographic strength of well-established asymmetric schemes; rather, we focused mainly on the visual quality of the air-gapped channel between the source and receiver.
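As an illustration of the first direction, the following sketch shows one possible way to add error detection to an OCR ciphertext group. It is not part of PRIVocular: it assumes that a CRC-32 checksum, written as eight extra HEX characters, is appended to each group, and the function names and the sample group are hypothetical. A checksum only detects a misread; correcting it would require an actual error-correcting code such as Reed–Solomon.

```python
import zlib


def protect_group(hex_group: str) -> str:
    """Append an 8-character CRC-32 (in HEX) to a ciphertext group."""
    payload = hex_group.upper()
    crc = zlib.crc32(payload.encode("ascii"))
    return payload + f"{crc:08X}"


def check_group(protected: str) -> bool:
    """Return True if the trailing CRC-32 matches the received payload."""
    payload, received = protected[:-8].upper(), protected[-8:].upper()
    return f"{zlib.crc32(payload.encode('ascii')):08X}" == received


# Hypothetical 16-character HEX group (32-bit key size).
group = "0A1B2C3D4E5F6071"
displayed = protect_group(group)

# Simulate a typical OCR confusion: the first '0' is misread as 'D'.
misread = "D" + displayed[1:]

ok_clean = check_group(displayed)   # True
ok_misread = check_group(misread)   # False: the misread is detected
```

With such a check, the receiver could reject a misdetected group and rescan it instead of attempting a decryption that is bound to fail.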

The metaverse’s disruptive nature offers benefits, but traditional security solutions may be ineffective due to its immersiveness, hyper spatio-temporality, sustainability, interoperability, scalability, and heterogeneity. These properties challenge fast service authorization, compliance auditing, and accountability enforcement, and the virtual worlds of a large-scale metaverse pose significant interoperability challenges. Privacy and security are key, with two-factor avatar authentication and data protection being critical [45]. In conclusion, the results above indicate that PRIVocular is a holistic, VR-ready platform that preserves privacy and can meet the strict security requirements of the metaverse and eHealthcare.

Funding

This research received no external funding.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the author on request.

Acknowledgments

The author would like to thank Nektarios Tsoutsos and Anastasis Keliris, both members of MoMAlab (NYUAD), for their considerable efforts and helpful approach to producing this article.

Conflicts of Interest

The author declares no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

OCR	Optical character recognition
QR	Quick Response Code
ECC	Error Correction Code

References

Simkin, M.; Schröder, D.; Bulling, A.; Fritz, M. Ubic: Bridging the Gap between Digital Cryptography and the Physical World. In Computer Security—ESORICS 2014; Kutyłowski, M., Vaidya, J., Eds.; ESORICS 2014, Lecture Notes in Computer Science; Springer: Cham, Switzerland, 2014; Volume 8712. [Google Scholar]
Reality 51 Team Managed by Łukasz Rosiński, The Farm 51 Group S.A., Report on the Current State of the VR Market. 2015. Available online: http://thefarm51.com/ripress/VR_market_report_2015_The_Farm51.pdf (accessed on 18 April 2025).
Holley, R. How Good Can It Get? Analysing and Improving OCR Accuracy in Large Scale Historic Newspaper Digitisation Programs. D-Lib Magazine. April 2009. Available online: https://www.dlib.org/dlib/march09/holley/03holley.html (accessed on 5 January 2014).
Smith, R. An Overview of the Tesseract OCR Engine. In Proceedings of the Ninth International Conference on Document Analysis and Recognition (ICDAR 2007), Curitiba, Parana, 23–26 September 2007; pp. 629–633. [Google Scholar] [CrossRef]
Mantoro, T.; Sobri, A.M.; Usino, W. Optical Character Recognition (OCR) Performance in Server-Based Mobile Environment. In Proceedings of the 2013 International Conference on Advanced Computer Science Applications and Technologies, Kuching, Malaysia, 23–24 December 2013; pp. 423–428. [Google Scholar] [CrossRef]
Shen, M.; Lei, H. Improving OCR performance with background image elimination. In Proceedings of the 2015 12th International Conference on Fuzzy Systems and Knowledge Discovery (FSKD), Zhangjiajie, China, 15–17 August 2015; pp. 1566–1570. [Google Scholar] [CrossRef]
Ramiah, S.; Liong, T.Y.; Jayabalan, M. Detecting text based image with optical character recognition for English translation and speech using Android. In Proceedings of the 2015 IEEE Student Conference on Research and Development (SCOReD), Kuala Lumpur, Malaysia, 13–14 December 2015; pp. 272–277. [Google Scholar] [CrossRef]
From QR Code.com. Denso-Wave. “QR Code Standardization”. Available online: http://www.qrcode.com/en/about/standards.html (accessed on 23 May 2016).
Sadikin, M.A.; Sunaringtyas, S.U. Implementing digital signature for the secure electronic prescription using QR-code based on Android smartphone. In Proceedings of the 2016 International Seminar on Application for Technology of Information and Communication (ISemantic), Semarang, Indonesia, 5–6 August 2016; pp. 306–311. [Google Scholar] [CrossRef]
Rodrigues, B.; Chaudhari, A.; More, S. Two factor verification using QR-code: A unique authentication system for Android smartphone users. In Proceedings of the 2nd International Conference on Contemporary Computing and Informatics (IC3I), Greater Noida, India, 14–17 December 2016; pp. 457–462. [Google Scholar] [CrossRef]
Jagodić, D.; Vujiĉić, D.; Ranđić, S. Android system for identification of objects based on QR code. In Proceedings of the 2015 23rd Telecommunications Forum Telfor (TELFOR), Belgrade, Serbia, 24–26 November 2015; pp. 922–925. [Google Scholar] [CrossRef]
Divya, R.; Muthukumarasamy, S. An impervious QR-based visual authentication protocols to prevent black-bag cryptanalysis. In Proceedings of the 2015 IEEE 9th International Conference on Intelligent Systems and Control (ISCO), Coimbatore, India, 9–10 January 2015; pp. 1–6. [Google Scholar] [CrossRef]
Patil, D.; Guru, S.K. Secured authentication using challenge-response and quick-response code for Android mobiles. In Proceedings of the International Conference on Information Communication and Embedded Systems (ICICES2014), Chennai, India, 27–28 February 2014; pp. 1–4. [Google Scholar] [CrossRef]
Bani-Hani, R.M.; Wahsheh, Y.A.; Al-Sarhan, M.B. Secure QR code system. In Proceedings of the 2014 10th International Conference on Innovations in Information Technology (IIT), Al Ain, United Arab Emirates, 9–11 November 2014; pp. 1–6. [Google Scholar] [CrossRef]
Dey, S.; Nath, A.; Agarwal, S. Confidential Encrypted Data Hiding and Retrieval Using QR Authentication System. In Proceedings of the 2013 International Conference on Communication Systems and Network Technologies, Gwalior, India, 6–8 April 2013; pp. 512–517. [Google Scholar] [CrossRef]
Ravi, R.V.; Dutta, P.K.; Roy, S. Color Image Cryptography Using Block and Pixel-Wise Permutations with 3D Chaotic Diffusion in Metaverse. In International Conferences on Artificial Intelligence and Computer Vision; Springer: Cham, Switzerland, 2023. [Google Scholar]
De Lorenzis, F.; Visconti, A.; Marani, M.; Prifti, E.; Andiloro, C.; Cannavò, A.; Lamberti, F. 3DK-Reate: Create Your Own 3D Key for Distributed Authentication in the Metaverse. In Proceedings of the 2023 IEEE Gaming, Entertainment, and Media Conference (GEM), Bridgetown, Barbados, 19–22 November 2023; pp. 1–6. [Google Scholar]
González, N.M.; Bozkir, E. Eye-Tracking Devices for Virtual and Augmented Reality Metaverse Environments and Their Compatibility with the European Union General Data Protection Regulation. Digit. Soc. 2024, 3, 39. [Google Scholar] [CrossRef]
Lin, C.-C.; Nshimiyimana, A.; SaberiKamarposhti, M.; Elbasi, E. Authentication Framework for Augmented Reality with Data-Hiding Technique. Symmetry 2024, 16, 1253. [Google Scholar] [CrossRef]
El-Hajj, M. Cybersecurity and Privacy Challenges in Extended Reality: Threats, Solutions, and Risk Mitigation Strategies. Virtual Worlds 2025, 4, 1. [Google Scholar] [CrossRef]
Cruz, A.C.; Costa, R.L.d.C.; Santos, L.; Rabadão, C.; Marto, A.; Gonçalves, A. Assessing User Perceptions and Preferences on Applying Obfuscation Techniques for Privacy Protection in Augmented Reality. Future Internet 2025, 17, 55. [Google Scholar] [CrossRef]
Paillier, P. Public-Key Cryptosystems Based on Composite Degree Residuosity Classes. In EUROCRYPT 1999; Springer: Berlin/Heidelberg, Germany, 1999; pp. 223–238. [Google Scholar] [CrossRef]
Mazonka, O.; Tsoutsos, N.G.; Maniatakos, M. Cryptoleq: A Heterogeneous Abstract Machine for Encrypted and Unencrypted Computation. IEEE Trans. Inf. Forensics Secur. 2016, 11, 2123–2138. [Google Scholar] [CrossRef]
Android Developers. Available online: https://developer.android.com/index.html (accessed on 18 April 2025).
Massandy, D.T.; Munir, I.R. Secured video streaming development on smartphones with Android platform. In Proceedings of the 2012 7th International Conference on Telecommunication Systems, Services, and Applications (TSSA), Denpasar-Bali, Indonesia, 30–31 October 2012; pp. 339–344. [Google Scholar] [CrossRef]
Starnberger, G.; Froihofer, L.; Goeschka, K.M. Qr-tan: Secure mobile transaction authentication. In Proceedings of the 2009 International Conference on Availability, Reliability and Security, Fukuoka, Japan, 16–19 March 2009; pp. 578–583. [Google Scholar]
Guruswami, V.; Sudan, M. Improved decoding of Reed-Solomon codes and algebraic geometry codes. IEEE Trans. Inf. Theory 1999, 45, 1757–1767. [Google Scholar] [CrossRef]
Bikos, A.N.; Sklavos, N. The Future of Privacy and Trust on the Internet of Things (IoT) for Healthcare: Concepts, Challenges, and Security Threat Mitigations. In Recent Advances in Security, Privacy, and Trust for Internet of Things (IoT) and Cyber- Physical Systems (CPS); Li, K.-C., Gupta, B.B., Agrawal, D.P., Eds.; CRC Press: Boca Raton, FL, USA, 2020; ISBN 9780367220655. [Google Scholar]
Chan, T.H.; Ho, S.W.; Yamamoto, H. Private information retrieval for coded storage. In Proceedings of the 2015 IEEE International Symposium on Information Theory (ISIT), Hong Kong, China, 14–19 June 2015; pp. 2842–2846. [Google Scholar] [CrossRef]
Kamara, S.; Papamanthou, C. Parallel and Dynamic Searchable Symmetric Encryption. In Proceedings of the Financial Cryptography and Data Security—17th International Conference, FC 2013, Okinawa, Japan, 1–5 April 2013. [Google Scholar]
Yi, X.; Kaosar, G.; Paulet, R.; Bertino, E. Single-Database Private Information Retrieval from Fully Homomorphic Encryption. IEEE Trans. Knowl. Data Eng. 2013, 25, 1125–1134. [Google Scholar] [CrossRef]
Pawar, V.V.; Singh, S.K.; Duggal, M.; Irabatti, A. The Metaverse: An efficient and adaptable virtual reality platform for medical education and treatment. J. Educ. Health Promot. 2024, 13, 415. [Google Scholar] [CrossRef] [PubMed]
Jana, S.; Narayanan, A.; Shmatikov, V. A scanner darkly: Protecting user privacy from perceptual applications. In Proceedings of the IEEE Symposium on Security and Privacy, Berkeley, CA, USA, 19–22 May 2013; pp. 349–363. [Google Scholar]
Bálint, Z.; Kiss, B.; Magyari, B.; Simon, K. Augmented reality and image recognition based framework for treasure hunt games. In Proceedings of the 2012 IEEE 10th Jubilee International Symposium on Intelligent Systems and Informatics, Subotica, Serbia, 20–22 September 2012; pp. 147–152. [Google Scholar] [CrossRef]
Du, R.; Lee, E.; Varshney, A. Tracking-Tolerant Visual Cryptography. In Proceedings of the 2019 IEEE Conference on Virtual Reality and 3D User Interfaces (VR), Osaka, Japan, 23–27 March 2019; pp. 902–903. [Google Scholar]
Gao, Y. Secure Key Exchange Protocol based on Virtual Proof of Reality. IACR Cryptol. ePrint Arch. 2015, 524. [Google Scholar]
Pahi, S.; Schroeder, C. Extended Privacy for Extended Reality: XR Technology Has 99 Problems and Privacy Is Several of Them. Notre Dame J. Emerg. Tech. 2023, 4, 1. [Google Scholar] [CrossRef]
Jones, D.; Ghasemi, S.; Gračanin, D.; Azab, M. Privacy, safety, and security in extended reality: User experience challenges for neurodiverse users. In HCI for Cybersecurity, Privacy and Trust, Proceedings of the International Conference on Human-Computer Interaction, Copenhagen, Denmark, 23–28 July 2023; Springer: Berlin/Heidelberg, Germany, 2023; pp. 511–528. [Google Scholar]
Warin, C.; Reinhardt, D. Vision: Usable Privacy for XR in the Era of the Metaverse. In Proceedings of the 2022 European Symposium on Usable Security, Karlsruhe, Germany, 29–30 September 2022; pp. 111–116. [Google Scholar]
Acheampong, R.; Balan, T.C.; Popovici, D.M.; Rekeraho, A. Embracing XR System Without Compromising on Security and Privacy. In Extended Reality, Proceedings of the International Conference on Extended Reality, Lecce, Italy, 6–9 September 2023; Springer: Berlin/Heidelberg, Germany, 2023; pp. 104–120. [Google Scholar]
GitHub Qrcode-Terminal. Available online: https://github.com/gtanner/qrcode-terminal (accessed on 18 April 2025).
Kanai, J.; Nartker, T.A.; Rice, S.; Nagy, G. Performance metrics for document understanding systems. In Proceedings of the Second International Conference on Document Analysis and Recognition, Tsukuba Science City, Japan, 20–22 October 1993; pp. 424–427. [Google Scholar] [CrossRef]
Official ZXing (“Zebra Crossing”) Project Home. Available online: https://github.com/zxing/zxing (accessed on 18 April 2025).
Stefanov, E.; Shi, E.; Song, D. Towards practical oblivious RAM. In Proceedings of the NDSS 2012, San Diego, CA, USA, 5–8 February 2012. [Google Scholar]
Canbay, Y.; Utku, A.; Canbay, P. Privacy Concerns and Measures in Metaverse: A Review. In Proceedings of the 2022 15th International Conference on Information Security and Cryptography (ISCTURKEY), Ankara, Turkey, 19–20 October 2022; pp. 80–85. [Google Scholar]

Figure 1. Pristine ‘h’ (leftmost); broken ‘h’ (middle); (rightmost) features matched to classifier prototypes [4].

Figure 2. Structure of a QR code, highlighting functional elements [8].

Figure 3. The architecture of eHealth cloud data storage service.

Figure 4. The encode–decode/encrypt–decrypt full cycle.

Figure 5. Overview of the PRIVocular stateflow functionality: (a) key generation phase, (b) encryption phase (client side), and (c) decryption phase (server side).

Figure 6. PRIVocular combined with VR smart glasses.

Figure 7. PRIVocular’s typical applicability scenario.

Figure 8. The key generation and distribution technique.

Figure 9. The key QRs’ content data.

Figure 10. The OCR mode of operation functionality.

Figure 11. Example deployment scenario of OCR filter algorithm.

Figure 12. A MoMAtag image sample.

Figure 13. MoMAtag operation mode.

Figure 14. Hybrid mode of operation.

Figure 15. Experimental setup.

Figure 16. Image type 1 entropy.

Figure 17. Image type 1 grayscale histogram.

Figure 18. Image type 1 noise.

Figure 19. Image type 2 entropy.

Figure 20. Image type 2 grayscale histogram.

Table 1. QR error-correction levels.

ECC Level	Amount of Correctable Data
Level L (Low)	7% of codewords can be restored
Level M (Medium)	15% of codewords can be restored
Level Q (Quartile)	25% of codewords can be restored
Level H (High)	30% of codewords can be restored

Table 2. Comparison features between QR-40H and MoMAtag.

	QR	MoMAtag
ECC Level:	Level H (High)	Level H (High)
Data encoding format:	binary/byte	binary/byte
bits/char:	8	8
max. characters:	1273	2049
EC Code Words Per Block:	30	30
Block 1 Count:	20	1
Block 1 Data Code Words:	15	2334
Block 2 Count:	61	0
Block 2 Data Code Words:	16	0
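The QR column of Table 2 can be cross-checked with a few lines of arithmetic. The sketch below uses only the block figures listed in the table, plus two standard facts about QR codes that are not stated in the table itself: in byte mode, versions 10 to 40 spend a 4-bit mode indicator and a 16-bit character count on overhead, and a Reed–Solomon block with 30 error-correction codewords can correct up to 15 erroneous codewords. The variable names are illustrative.

```python
EC_CODEWORDS_PER_BLOCK = 30

# (block count, data codewords per block) for QR version 40, level H.
qr_40h_blocks = [(20, 15), (61, 16)]

# Total data codewords: 20 * 15 + 61 * 16 = 1276.
data_codewords = sum(count * size for count, size in qr_40h_blocks)

# Total codewords in the symbol, data plus error correction: 3706.
block_count = sum(count for count, _ in qr_40h_blocks)
total_codewords = data_codewords + EC_CODEWORDS_PER_BLOCK * block_count

# Byte-mode overhead: 4-bit mode indicator + 16-bit character count.
overhead_bits = 4 + 16
max_characters = (data_codewords * 8 - overhead_bits) // 8  # 1273

# Reed-Solomon corrects up to half as many codewords as it adds per block.
correctable_per_block = EC_CODEWORDS_PER_BLOCK // 2  # 15
```

The result, 1273 characters, agrees with the max. characters entry for the conventional QR tag in Table 2.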

Table 3. Visual analysis table.

Input Mode	Num. of Elements	Max. Characters	Bits/Char.	Encoding	Square Pixels Per Element	Distance
OCR	1200	600	8	HEX/byte	1683	45 cm
MoMAtag	2048	1024	8	Binary/byte	1,176,120	60 cm
Hybrid	640	320	8	Binary/byte	49,163	100 cm

Table 4. Visual analysis (OCR) table.

Key Size (bits)	32	64	128	256	512	1024	2048	4096
Num. of recognized elements (ASCII)	47	22	9	4	1	NA	NA	NA
Max. characters (HEX)	752	704	576	512	256	NA	NA	NA
Square Pixels Per Element Character (pixels)	2184	2184	2184	2184	5332	NA	NA	NA
Physical distance (cm)	43	43	43	43	43	NA	NA	NA
Mean confidence	75	84	85	81	91	NA	NA	NA
Time required (ms)	30,730	30,137	22,450	20,825	13,954	NA	NA	NA

Table 5. Visual analysis (QR-based) table.

Key Size (bits)	32	64	128	256	512	1024	2048	4096
Num. of recognized elements (ASCII)	24	18	8	6	3	2	2	1
Max. characters (HEX)	384	576	512	768	768	1024	2048	2048
Square Pixels Per Element Character (pixels)	54,756	74,529	125,316	190,096	351,649	625,681	447,561	2,190,400
Physical distance (cm)	43	43	43	43	43	43	43	43
Mean confidence	-	-	-	-	-	-	-	-
Time required (ms)	2140	1040	720	870	800	860	940	3500

Table 6. Comparison of several PIR schemes (from the communication complexity perspective).

Security Scheme	Supported Queries	Privacy Strength	Computational Complexity	Limitations
Private information retrieval from FHE	All types of queries	Strong	$O(m\gamma^{2}\log(m) + n\gamma/2)$	Expensive secure hardware
Oblivious RAM [30]	Range and join queries	Can achieve privacy/efficiency trade-off	$\Theta(\sqrt{n})$	The cloud’s computation power is underused
Traditional PIR	Keyword search queries	Practical	$O(l \cdot n \cdot \log(k) \cdot \log(\log(k)))$	Part of query computation overhead is shifted to data owners
Query-specific encryption [44]	Keyword search queries	Strong	$O(\frac{r}{p}\log n)$	Not all queries are supported
PRIVocular	All types of queries	Strong	$\Theta(n)$	Visible light conditions can vary

i. m corresponds to the number of blocks, the database DB has n bits, and the ciphertext size is $\gamma$.
ii. n is the number of database items.
iii. k is a security parameter, with a database size of n elements, each of bit length l.
iv. With n we denote the size of the document collection, with r the number of documents containing keyword w, and with p the number of cores.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
