1. Introduction
There has been an exponential increase in image-based data transmission in recent years, driven mainly by the adoption of digital communication platforms such as IoT-centric applications, social media, and cloud storage. Much of this information is sensitive, and maintaining a high degree of integrity and confidentiality for such data files is challenging. Textual image content is widely used in secure governmental communication, financial document exchange, and medical imaging. Concern over image security has remained critically high, largely because of the influx of e-commerce platforms, cloud storage, social media sharing, and online communication. As images carry potentially sensitive information, their leakage or any other form of security breach is a serious concern [1]. Conventional cryptographic methods mainly emphasise data transformation using either symmetric or asymmetric algorithms; however, they lack resistance to dynamic threats, and no optimisation is carried out to boost reconstruction fidelity. Conventional encryption schemes and chaotic map-based methods are frequently used to secure images [2]. Because these schemes permute pixel values and then generate a pattern that resists unauthorised access, they are considered strong and viable. However, encryption schemes are not proactive in nature [3]. At the same time, Artificial Intelligence (AI) has emerged as a promising solution, specifically through deep learning (DL) models such as Generative Adversarial Networks (GANs) and autoencoders [4,5]. These approaches have proven their potential for learning complex data representations for anomaly detection and secure image transmission. Unfortunately, they demand voluminous training data, which often increases computational overhead, and they do not provide robustness in generating encryption keys. In addition, little integration is reported between AI principles and encryption models, which degrades performance in terms of resistance against adversarial attacks, convergence stability, and reconstruction error [6,7]. Hence, there is a significant research gap in constructing an integrated computational security model that can efficiently reconstruct textual image data, perform adaptive optimisation, and generate a secure key. The motivation for the proposed model arises from this requirement, namely bridging the gap between neural optimisation and cryptographic robustness. This is accomplished by unifying pseudo-random generation of a secret key with gradient-based learning. Such integration facilitates enhanced computational efficiency and reconstruction accuracy, along with improved data confidentiality. These properties make the proposed system well suited to the secure, practical transmission of images.
The aim of the proposed study is to develop an optimised hybrid neural-cryptographic computational model capable of securing the sharing of textual image data. The contributions of the proposed study are as follows:
The proposed system integrates gradient-optimised neural encryption with pseudo-random cryptographic keys, unlike existing systems, which treat encryption as a separate entity from learning.
Unlike existing solutions, where the complete image is encrypted uniformly, the proposed model performs fuzzy batch-normalised extraction of the textual region before encryption, enhancing reconstruction stability and controlling redundancy.
The proposed framework generates an adaptive threshold using a two-point gradient approximation, whereas existing solutions use fixed thresholds.
The proposed method yields a lightweight computational model that runs without a GPU even though it integrates encryption with deep learning.
The key innovations of the proposed study are summarised as follows. The proposed TPR-BBGAN introduces a tightly coupled, integrated architecture that jointly handles optimisation, secure encryption, and reconstruction. This differs from classical methods, which treat key generation and neural learning as two separate, standalone processes. The originality of TPR-BBGAN lies in its integration of gradient-centric adaptive learning with pseudo-random key generation, which facilitates enhanced convergence behaviour and dynamic thresholding. In addition, the proposed model couples preprocessing with a lightweight architecture in which regions dominated by textual content are prioritised. In this way, the proposed model balances computational efficiency and security robustness. This unified architecture differs from existing models, which depend upon either loosely connected entities or a static configuration.
The organisation of the paper is as follows: Section 2 presents a review of the literature, Section 3 presents a discussion of the adopted research methodology, and Section 4 presents an extensive discussion of the results, while the conclusion is given in Section 5.
2. Related Works
The existing literature shows various trends in methods for securing data. The adoption of chaotic systems is noted in the work of Chen et al. [8], where the Least Significant Byte (LSB) has been used for encryption considering dynamic auxiliary input. Wang et al. [9] have used an autoencoder for compression and encryption of an image to enhance encryption reliability. Iqbal et al. [10] have used a Convolutional Neural Network (CNN) to integrate steganography and encryption, minimising vulnerabilities during image transmission. Maity et al. [11] have integrated a vision transformer-oriented autoencoder with a multi-dimensional hyperchaotic system; the idea was to generate a scrambled latent image with pseudo-random sequences for efficient encryption of colour images. Alexan et al. [12] have presented chaos-centric image encryption, where a nonlinear transformation is used. Chen et al. [13] have investigated a deep learning-based method for image encryption that transforms plain images into secret images with noise. The idea is to render higher non-detectability while transmitting images, similar to the scheme of Hao et al. [14]. Kumar and Revathi [15] have presented a deep neural network integrated with a logistic map incorporated within a discrete memory system. The model is reported to thwart vulnerabilities in an insecure channel. Duggirala et al. [16] have integrated a CNN with a chaotic logistic map and an autoencoder to strengthen resilience and encryption. Rohhila and Singh [17] have studied various types of deep learning models for the secure transmission of an image and found that almost all the models suffer from acute computational overheads, even though some works claim extended security.
Visual cryptography is another frequently adopted approach to secure image transmission, as seen in the work of Ren and Zhang [18]. Their model integrates visual cryptography to secure the secret shares against potential breaches. Stoycheva et al. [19] have used an Artificial Neural Network (ANN) fused with chaotic synchronisation to generate a multilayered, high-entropy image encryption platform. Panwar et al. [20] used a Generative Adversarial Network (GAN) for encryption to enhance resistance against plaintext intrusion on an image. Apart from this, various other methodologies have been introduced for image security, viz., neural network-based approaches (Lyu [21]), image encryption schemes (Alghami and Munir [22]), deep learning-based visual encryption (Iqbal et al. [10], Ali [23], Alserhani et al. [24]), and chaotic approaches (Mahalaksmi & Nagarajan [25], Niu et al. [26]).
Existing studies have also investigated hybrid deep learning and encryption-centric frameworks for image communication, claiming adaptable and robust outcomes [27,28,29]. There is also literature on chaotic and hyperchaotic systems for enhanced randomness and key space in image encryption. For example, methods using three-dimensional (3D) logistic map-based encryption have exhibited significant resistance to brute-force and statistical attacks by maximising unpredictability and system complexity. These approaches offer insight for enhancing key generation strategies within a neural encryption system [30,31]. Certain studies also use enhanced cryptanalysis and sine-based reconstruction systems to boost the strength of chaotic encryption [32]. Such methodologies offer only conceptual guidance for boosting security strength and optimising reconstruction fidelity within a neural encryption model [33].
Despite extensive studies on image encryption, existing models offer insufficient integration between cryptography and machine learning approaches. Conventional neural cryptography suffers from higher reconstruction error as well as suboptimal convergence. Almost all existing models depend upon either static or heuristic thresholds for security validation, which adversely affects the reliability and adaptability of the model. Finally, weak randomness in key generation is noted in conventional solutions, and no text-centric preprocessing is carried out. The proposed solution identifies and addresses all of these research problems.
3. Research Method
The proposed study introduces a novel hybrid cryptographic model named Twister Pseudo Random and Barzilai–Borwein Gradient Autoencoder Neural Network (TPR-BBGAN). This model targets securing the data-sharing mechanism for textual images.
This is accomplished by integrating pseudo-random key generation with encryption based on deep learning, along with reconstruction optimisation. The complete implementation is carried out in three core stages: preprocessing, key generation, and encryption/decryption.
Figure 1 showcases the architecture of the proposed model, which comprises edge computing devices together with an edge-AI environment and AI sensors. The raw textual image data is subjected to preprocessing, followed by public and private key generation. Encryption is then carried out on the preprocessed textual image data using the proposed BBGAN, which secures and protects the privacy of the data shared among users. The remainder of this section describes the operations involved in the architectural design and implementation.
Note that the “BBGAN” in the proposed computational framework does not denote a classical GAN framework. It denotes a novel autoencoder architecture optimised by a gradient approach based on Barzilai–Borwein optimisation. The terminology reflects the generative character of the autoencoder in rebuilding the encrypted representation, rather than adversarial training. Unlike conventional GANs, the proposed model does not involve a discriminator network; instead, it emphasises optimised reconstruction and the encoding–decoding process to facilitate secure communication.
3.1. Dataset Description
The BOSSBase v1.0.1 dataset is employed for simulation in this work [34]. To ensure secure data sharing using cryptography, a total of 10,000 greyscale images of size ‘256 × 256’ pixels, with pixel values ranging between ‘0’ and ‘255’, are employed. The dataset contains two distinct types of images, natural images and text images; the text images are employed for secure data sharing using cryptography. The images are split randomly into training and test sets comprising 60% and 40% of the images, respectively.
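The random 60/40 partition described above can be sketched as follows. This is a minimal illustration using NumPy; the function name `split_indices` and the fixed seed are assumptions made here for reproducibility and are not taken from the original implementation.

```python
import numpy as np

def split_indices(n_images, train_fraction=0.6, seed=0):
    """Return shuffled (train, test) index arrays for n_images items.

    Hypothetical helper: the seed and function name are illustrative only.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_images)            # random ordering of all image indices
    n_train = int(round(train_fraction * n_images))
    return order[:n_train], order[n_train:]

# 10,000 images split into 60% training and 40% testing
train_idx, test_idx = split_indices(10_000)
print(len(train_idx), len(test_idx))
```

Splitting indices rather than the image arrays themselves keeps the sketch independent of how the images are stored on disk.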
3.2. Preprocessing Module
The proposed system applies preprocessing to enhance the textual features extracted before encryption. With image text acquired as input from the BOSSBase v1.0.1 dataset, the network structure has to be modelled specifically for the encryption task. The network structure is therefore designed around image text, and the accuracy of the encryption results is subsequently analysed and validated by fine-tuning the network structure. This is performed by applying a fuzzy batch-normalised preprocessing model, whose objective is to better extract the characteristics of the image text.
Figure 2 shows the structure of the Fuzzy Batch-Normalised Preprocessing model.
The proposed model considers a ‘textual image’ to be an image whose primary informative components are textual content. Suitable examples include scene text images, printed text, handwritten notes, and scanned documents. Unlike natural images, a textual image has distinctive structural characteristics: sparse semantic regions, edge-dominant patterns, and high-contrast regions. The proposed model handles such images by prioritising the regions with dominant textual content during extraction and encryption, via preprocessing using fuzzy batch normalisation. This differs from image encryption methods that treat all pixels alike without any consideration of semantic priorities.
Figure 2 above shows ‘128’ batches of image text sized ‘32 × 32 × 64’ forming the input, from which the mean value is measured as the eigenvalues of the ‘128’ batches, each sized ‘1 × 1 × 64’. Finally, the ‘128’ batches of image text are synthesised into a resultant feature vector sized ‘1 × 1 × 1 × 64’. Following this, a fuzzy function is applied to the sample text image to retrieve the textual portions, as given below.
$$A = \{(x, \mu_A(x)) \mid \mu_A(x) \in [0, 1]\} \qquad (1)$$
In Equation (1), the fuzzy set ‘$A$’ for the corresponding sample text image ‘$x$’ is obtained by taking into consideration ‘$\mu_A(x)$’, the membership degree of text image ‘$x$’ in the fuzzy set ‘$A$’. Although a parametric formulation is used to define the membership function, its behaviour is influenced by statistical properties of the input image batches (viz., mean and variance) that are computed dynamically. This helps fine-tune the membership boundaries semi-adaptively during preprocessing. Future work could embed fully learnable fuzzy membership functions, parameterised by neural networks, to improve adaptability. The resultant value ‘0’ means that ‘$x$’ is not a member of the fuzzy set (i.e., not highly image text), whereas the value ‘1’ means that ‘$x$’ is entirely a member of the fuzzy set (i.e., highly image text). The portions showing the value ‘1’ are referred to as high representations of textual portions and are employed for further processing.
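The preprocessing step described above can be illustrated with a short sketch. The exact membership function is not specified in the text, so a sigmoid applied to batch-normalised values is assumed here purely for illustration; the 0.5 cut-off, the function name `fuzzy_batchnorm_membership`, and the reduced batch size are likewise hypothetical.

```python
import numpy as np

def fuzzy_batchnorm_membership(batch, eps=1e-5):
    """Map a batch of shape (B, H, W, C) to membership degrees in [0, 1].

    ASSUMPTION: a sigmoid of the batch-normalised activations stands in for
    the parametric membership function, which the text does not spell out.
    """
    mean = batch.mean(axis=(0, 1, 2), keepdims=True)   # per-channel batch mean
    var = batch.var(axis=(0, 1, 2), keepdims=True)     # per-channel batch variance
    normalised = (batch - mean) / np.sqrt(var + eps)
    return 1.0 / (1.0 + np.exp(-normalised))

rng = np.random.default_rng(0)
# A batch of 4 is used instead of the 128 in the text to keep the demo light.
batch = rng.random((4, 32, 32, 64))

per_image = batch.mean(axis=(1, 2))      # one 64-element mean vector per image
feature = per_image.mean(axis=0)         # synthesised 64-element feature vector
mu = fuzzy_batchnorm_membership(batch)   # membership degree of every element
text_mask = mu >= 0.5                    # hypothetical cut-off for "textual" regions
print(per_image.shape, feature.shape, mu.shape)
```

Because the mean and variance are recomputed for every batch, the membership boundaries shift with the input statistics, which is one way to read the semi-adaptive behaviour mentioned in the text.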
3.3. Key Generator Module
Secure communication demands robust key generation, so the proposed system generates both public and private keys. In the presented work, the public key PubK is generated using the Prime Random Public Key Generator function, and the private key PrivK is generated using the Mersenne Twister Pseudo Random Private Key Generator function. Key generation thus forms the second step of secure data sharing and is performed using the Prime Random Mersenne Twister Pseudo Random Key Generator model.
Figure 3 shows the structure of the Prime Random Mersenne Twister Pseudo Random Key Generator model.
As illustrated in Figure 3, a Random Prime Public Key Generator is first employed to model the public key. The public key for each text image is mathematically formulated, as given below.
In Equation (2), the public key for the corresponding user is obtained by acquiring three random prime numbers and applying the modulus operator to them. Next, a strong Pseudo Random Private Key Generator is formulated using the Mersenne Twister function, which is based on binary operations and is computationally efficient (in terms of training time) as well as memory-efficient, consuming only 624 words of working area. A series of private keys is then defined by the recurrence relation given below.
In Equations (3) and (4), the series of private keys is generated for the sample instances with respect to the corresponding identity. The upper bits and lower bits of successive words are combined by concatenation of bit vectors, and the bitwise XOR operator is then applied. Finally, the Mersenne Twister Transformation function is constructed from the identity matrix and the number of bits in the corresponding text image. In this manner, the generated public key is given to all users in the network, whereas text images encrypted with the public key can be viewed only by the intended recipient upon decryption with the private key, thereby ensuring secure data sharing.
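The two key generators described above can be sketched with Python's standard library, whose `random.Random` class is an implementation of the Mersenne Twister (MT19937). The way the three primes are combined is an assumption, since Equation (2) is not reproduced here: `(p * q) % r` is a hypothetical stand-in, as are the prime range, the 128-bit key length, and all function names. The sketch is illustrative only and is not a vetted cryptographic implementation.

```python
import random

def is_prime(n):
    """Trial-division primality test, adequate for small demo values."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    i = 3
    while i * i <= n:
        if n % i == 0:
            return False
        i += 2
    return True

def random_prime(rng, low=1000, high=10000):
    """Draw random candidates from [low, high) until one is prime."""
    while True:
        candidate = rng.randrange(low, high)
        if is_prime(candidate):
            return candidate

def public_key(rng):
    # ASSUMPTION: three random primes combined with a modulus; the exact
    # combination in Equation (2) is unknown, so (p * q) % r is hypothetical.
    p, q, r = (random_prime(rng) for _ in range(3))
    return (p * q) % r

def private_key(seed, bits=128):
    # random.Random is a Mersenne Twister generator; the same seed always
    # reproduces the same key. The 128-bit length is an illustrative choice.
    return random.Random(seed).getrandbits(bits)

rng = random.Random(2024)
print(public_key(rng), hex(private_key(seed=42)))
```

Seeding the generator per user makes the private key series reproducible for that user, which mirrors the per-identity generation described in the text.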
3.4. Encryption and Decryption Module
The main objective of the encryption model is to encrypt the text image using a non-recursive encryption model that ensures robust backward and forward secrecy, together with optimal retrieval of the text image during encryption. The output of the encryption algorithm is an encrypted text image that is stored on a physical server and used for further processing. An autoencoder is an Artificial Neural Network (ANN) that employs an unsupervised learning mechanism to capture the patterns and interconnections of the text image data. In the proposed work, the autoencoder is employed to ensure secure data sharing between users. The autoencoder architecture comprises an encoder and a decoder. During encoding, the network compresses the text image data into a lower-dimensional representation, whereas during decoding, the trained network reconstructs the input text image data. In our work, training is performed by applying the Barzilai–Borwein gradient to minimise the deviation between the input text image data and the reconstructed text image data.
Figure 4 shows the structure of the BBGAN-based encryption/decryption model.
As illustrated in Figure 4, the network structure of the Barzilai–Borwein Gradient Autoencoder Neural Network-based encryption/decryption model comprises an input layer, a hidden layer and an output layer. The input layer and the hidden layer constitute the encoder, whereas the hidden layer and the output layer constitute the decoder; the hidden layer therefore serves as both the encoder output and the decoder input. The formulation of the input layer is given below.
$$h_j = \sum_{i} w_{ij} x_i + b_j \qquad (5)$$
In Equation (5), the ‘$i$-th’ text image input node is represented by ‘$x_i$’ and the ‘$j$-th’ hidden node by ‘$h_j$’. The weight between the ‘$i$-th’ text image input node and the ‘$j$-th’ hidden node is represented by ‘$w_{ij}$’, with ‘$b_j$’ denoting the bias of the ‘$j$-th’ hidden node. The hidden node value is not used directly as the input of the next layer; it is first transformed by the ReLU activation function. This is mathematically formulated, as given below.
$$f(h_j) = \max(0, h_j) \qquad (6)$$
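The weighted sum over input nodes followed by the ReLU activation, as described above, can be sketched in a few lines of NumPy. The weights, bias and input values are arbitrary illustrative numbers, and the function name `hidden_layer` is hypothetical.

```python
import numpy as np

def hidden_layer(x, W, b):
    """Compute hidden activations: weighted sum of inputs plus bias, then ReLU.

    W[i, j] is the weight between input node i and hidden node j.
    """
    h = x @ W + b              # pre-activation value of each hidden node
    return np.maximum(0.0, h)  # ReLU keeps positive values, zeroes the rest

# Illustrative values only (3 input nodes, 2 hidden nodes)
x = np.array([1.0, -2.0, 0.5])
W = np.array([[0.2, -0.5],
              [0.4,  0.1],
              [-0.6, 0.3]])
b = np.array([1.0, 0.5])

print(hidden_layer(x, W, b))
```

With these numbers the first hidden node has a pre-activation of about 0.1 and passes through unchanged, while the second is about -0.05 and is clipped to zero by the ReLU.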
In addition to the hidden node activation given in Equation (6), a suitable value for the weight has to be used to adjust the connection between the input layer (i.e., text image data) and the output layer (i.e., encrypted text image data) and to obtain the output with minimal loss. The encoder function is expressed, as given below.
$$z = f(Wx + b) \qquad (7)$$
In Equation (7), ‘$z$’ denotes the encoded text image data representation, obtained through the activation function ‘$f$’, with the corresponding weight matrix and bias vector denoted as ‘$W$’ and ‘$b$’ for the text image data ‘$x$’. While compressing the text image data into a lower-dimensional latent data representation, significant features are captured. The decoder, in turn, takes the latent data representation as input and reconstructs the original text image data. The decoder function, which is the reverse operation of the encoder, is expressed, as given below.
$$\hat{x} = g(W' z + b') \qquad (8)$$
In Equation (8), ‘$g$’ represents the decoder’s activation function, applied with the decoder weight matrix ‘$W'$’ and decoder bias vector ‘$b'$’ to the encrypted text image data ‘$z$’. To ensure secure and robust data sharing between users, the reconstruction error is formulated, as given below.
$$E = \lVert x - \hat{x} \rVert^2 \qquad (9)$$
In Equation (9), the reconstruction error ‘$E$’ is formulated based on the difference between the original text image data ‘$x$’ and the reconstructed text image data ‘$\hat{x}$’. The proposed method assesses the secret data sharing mechanism on the basis of the reconstruction error. If the reconstruction error exceeds the threshold value, the receiving end user is treated as malicious and is eliminated from the network to preserve the security and privacy of the data.
On the other hand, if the reconstruction error is below the threshold value, the receiving end user is not considered malicious, and secure data sharing is ensured between users in the network. The threshold value is selected so as to be neither too high nor too low. It is selected using the Barzilai–Borwein Gradient function; as the weight and bias have a strong influence, the gradient of the objective function is taken with respect to the parameters ‘$W$’ and ‘$b$’, so that the point with the least error can be found.
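The encode, decode and threshold-check sequence described above can be sketched as follows. The squared-difference error, the ReLU used in both encoder and decoder, the identity weights and the threshold of 0.05 are all assumptions chosen to make the example easy to follow; they are not the trained parameters of the proposed model.

```python
import numpy as np

def relu(v):
    return np.maximum(0.0, v)

def encode(x, W, b):
    """Encoder: compress the input into its latent representation."""
    return relu(W @ x + b)

def decode(z, W_dec, b_dec):
    """Decoder: reconstruct the input from the latent representation."""
    return relu(W_dec @ z + b_dec)

def reconstruction_error(x, x_hat):
    """Sum of squared differences between original and reconstruction (assumed form)."""
    return float(np.sum((x - x_hat) ** 2))

def is_trusted(x, x_hat, threshold):
    """A receiver is flagged as malicious when the error exceeds the threshold."""
    return reconstruction_error(x, x_hat) <= threshold

# Illustrative setup: identity weights reproduce a non-negative input exactly.
x = np.array([0.2, 0.7, 0.1])
W = W_dec = np.eye(3)
b = b_dec = np.zeros(3)

x_hat = decode(encode(x, W, b), W_dec, b_dec)
tampered = x_hat + 0.5                      # simulated corrupted reconstruction
print(is_trusted(x, x_hat, threshold=0.05), is_trusted(x, tampered, threshold=0.05))
```

The clean reconstruction has zero error and passes the check, whereas the tampered one has an error of 0.75 and is rejected.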
The threshold value is identified using the Barzilai–Borwein Gradient function, which is derived from the linear trend of the two most recent iterates, ‘$\theta_{k-1}$’ and ‘$\theta_k$’. Let ‘$g_k$’ represent the gradient vector at iteration ‘$k$’, and let there be two differences ‘$s_{k-1} = \theta_k - \theta_{k-1}$’ and ‘$y_{k-1} = g_k - g_{k-1}$’, in which ‘$s_{k-1} = -\alpha_{k-1} g_{k-1}$’, with ‘$\alpha_{k-1}$’ representing the step size of the preceding iteration. The Barzilai–Borwein Gradient function for obtaining the optimal threshold value is then mathematically represented, as given below.
In Equation (10), two-point approximations based on ‘$s_{k-1}$’ and ‘$y_{k-1}$’ are employed; the approach has a great advantage in solving an extensive range of constrained optimisation problems.
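The two-point idea behind the Barzilai–Borwein method can be sketched on a small quadratic problem. The sketch uses the standard "long" Barzilai–Borwein step, in which the step size is the squared norm of the iterate difference divided by the inner product of the iterate difference and the gradient difference. How the paper maps this quantity to its threshold is not reproduced here, so the code only shows the generic optimiser; the test problem, the initial step and the function name are assumptions.

```python
import numpy as np

def bb_gradient_descent(grad, x0, alpha0=0.01, iters=200, tol=1e-10):
    """Gradient descent with the (long) Barzilai-Borwein step size."""
    x_prev = np.asarray(x0, dtype=float)
    g_prev = grad(x_prev)
    x = x_prev - alpha0 * g_prev            # first step uses a fixed step size
    for _ in range(iters):
        g = grad(x)
        if np.linalg.norm(g) < tol:         # stop once the gradient vanishes
            break
        s = x - x_prev                      # difference of the last two iterates
        y = g - g_prev                      # difference of the last two gradients
        sy = s @ y
        alpha = (s @ s) / sy if sy > 0 else alpha0   # fall back if curvature is not positive
        x_prev, g_prev = x, g
        x = x - alpha * g
    return x

# Hypothetical test problem: minimise 0.5 * x^T A x - b^T x, whose gradient is A x - b
A = np.array([[3.0, 0.5],
              [0.5, 1.0]])
b = np.array([1.0, -2.0])
x_star = bb_gradient_descent(lambda x: A @ x - b, np.zeros(2))
print(x_star, A @ x_star - b)
```

No line search is needed: each step size is computed from information already available from the previous iteration, which is what keeps the method lightweight.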
4. Result Analysis
The proposed TPR-BBGAN has been implemented in Python 3.14.1 on a computer with a 2.4 GHz Core i5 processor and 8 GB RAM. The proposed autoencoder architecture has three primary layers: an input layer, a hidden layer for encoding, and an output layer for decoding. Patches of textual image of size 32 × 32 × 64 form the input layer, and the encoder has two fully connected layers with ReLU activation, which reduce the dimensionality to a latent representation of size 1 × 1 × 164. The decoder mirrors the encoder structure in order to reconstruct the input image. The model is trained using the Barzilai–Borwein gradient optimisation method with adaptive step sizes, using a batch size of 128 for 100 epochs. The weights are initialised with Xavier initialisation, and no GPU was involved in the presented experimental analysis. The implementation uses various Python deep learning libraries to support reproducibility. All hyperparameters, such as iteration count, batch size, and initial learning rate, were kept fixed throughout the experimentation.
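The Xavier initialisation mentioned above can be sketched as follows, using the uniform variant in which weights are drawn from a range determined by the layer's input and output widths. The layer sizes are hypothetical: only the latent width of 164 comes from the text, while the input and intermediate widths are reduced placeholders chosen to keep the example small.

```python
import numpy as np

def xavier_uniform(fan_in, fan_out, rng):
    """Draw a (fan_in, fan_out) weight matrix from the Xavier/Glorot uniform range."""
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

rng = np.random.default_rng(0)

# ASSUMPTION: 1024 and 256 are placeholder widths; 164 is the latent size from the text.
layer_sizes = [1024, 256, 164]
weights = [xavier_uniform(m, n, rng)
           for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n) for n in layer_sizes[1:]]

for W in weights:
    print(W.shape, float(np.abs(W).max()))
```

Scaling the range by the layer widths keeps the variance of activations roughly constant across the two fully connected encoder layers at the start of training.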
Multiple textual images of different sizes and multiple keys were used to measure all the values necessary to evaluate the performance and efficiency of the proposed method. Figure 5 depicts the image set used in the implementation, while Table 1 lists basic information about the images.
The key expected benefit of the Twister Pseudo Random and Barzilai–Borwein Gradient Autoencoder Neural Network (TPR-BBGAN) is to improve the efficiency of encryption–decryption operations to ensure secure data sharing. The efficiency of the TPR-BBGAN method was studied in terms of data confidentiality, data integrity, bit error rate (BER) and peak signal-to-noise ratio (PSNR). When designing a model for secure data sharing, one of the most significant performance metrics is data confidentiality. The data confidentiality rate is the percentage ratio of the number of sample textual image data accessed only by the intended recipient to the number of sample textual image data involved in the sharing process, and is mathematically represented, as given below:
$$DC = \frac{S_{IR}}{S} \times 100 \qquad (11)$$
In Equation (11), data confidentiality ‘$DC$’ is estimated by taking into consideration the sample textual image data involved in the secure data sharing process ‘$S$’ and the sample textual image data accessed only by the intended recipient ‘$S_{IR}$’. The data confidentiality rate is expressed as a percentage. Next, data integrity is evaluated to validate and analyse the integrity of the textual image data. Data integrity is measured as the percentage ratio of sample textual image data that are not altered by any users to the overall sample textual image data provided as input. The data integrity is mathematically formulated, as given below.
$$DI = \frac{S_{NA}}{S} \times 100 \qquad (12)$$
In Equation (12), the data integrity rate ‘$DI$’ is estimated by considering the sample textual image data taken for simulation ‘$S$’ and the number of textual image data not altered by any malicious users ‘$S_{NA}$’. Third, the bit error rate is estimated to validate the efficiency and effectiveness of the method, and is stated as given below.
In Equation (13), the bit error rate ‘$BER$’ is evaluated by taking into consideration the original textual image data ‘$x$’ and the decoded textual image data ‘$\hat{x}$’, with ‘$M \times N$’ denoting the corresponding dimensions. Finally, the peak signal-to-noise ratio (PSNR) is used to measure the efficiency of the proposed method. The higher the PSNR score, the closer the decoded textual image data is to the original textual image data. It is computed from the Mean Squared Error (MSE), which is the mean of the squared differences between the pixel values of the original textual image data and the decoded textual image data. MSE is evaluated, as given below.
$$MSE = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} \left( x(i,j) - \hat{x}(i,j) \right)^2 \qquad (14)$$
In Equation (14), the mean square error ‘$MSE$’ is arrived at based on the original textual image data ‘$x$’ and the decoded textual image data ‘$\hat{x}$’. Then, the peak signal-to-noise ratio is estimated, as given below.
$$PSNR = 10 \log_{10} \left( \frac{255^2}{MSE} \right) \qquad (15)$$
With the above formulas for data confidentiality, data integrity, BER and PSNR, the performance analysis compares the proposed TPR-BBGAN method fairly against two existing methods, Hybrid Random Forest and Logistic Regression (HRFLR) [35] and Instant-Hybrid Neural-Cryptography (IHNC) [36]. Here, fair comparison means that the same number of images and the same image sizes are used to validate the four performance metrics for the proposed TPR-BBGAN method and the existing methods, HRFLR and IHNC.
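The four metrics described above can be sketched in NumPy as follows. The percentage rates follow the ratio definitions in the text; the bit error rate is computed here as the fraction of differing bits between 8-bit images, and the PSNR assumes a peak value of 255. Both of these concrete forms, as well as the function names, are assumptions, since the corresponding equations are not reproduced in the text.

```python
import numpy as np

def percentage_rate(count, total):
    """Percentage ratio used for both data confidentiality and data integrity."""
    return 100.0 * count / total

def bit_error_rate(original, decoded):
    """Fraction of bits that differ between two 8-bit images (assumed form)."""
    a = np.unpackbits(np.asarray(original, dtype=np.uint8))
    b = np.unpackbits(np.asarray(decoded, dtype=np.uint8))
    return float(np.mean(a != b))

def mse(original, decoded):
    """Mean of the squared pixel differences."""
    diff = np.asarray(original, dtype=np.float64) - np.asarray(decoded, dtype=np.float64)
    return float(np.mean(diff ** 2))

def psnr(original, decoded, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    err = mse(original, decoded)
    return float("inf") if err == 0 else float(10.0 * np.log10(peak ** 2 / err))

# Tiny illustrative example: a 2 x 2 image with one pixel off by one grey level
original = np.zeros((2, 2), dtype=np.uint8)
decoded = original.copy()
decoded[0, 0] = 1

print(bit_error_rate(original, decoded), mse(original, decoded), psnr(original, decoded))
```

In this example one bit out of 32 differs, so the bit error rate is 1/32, and one squared difference of 1 over four pixels gives an MSE of 0.25.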
4.1. Accomplished Results
The initial assessment is carried out for data confidentiality, which refers to the ability to scramble the sample text image data using encryption algorithms so that it cannot be understood by an unauthorised person who intercepts it. The higher the data confidentiality, the more efficient the applied method or algorithm is said to be.
Table 2 shows the data confidentiality results obtained by substituting the values in Equation (11). To ensure fair comparison, the same sample text images are applied to all three methods: Twister Pseudo Random and Barzilai–Borwein Gradient Autoencoder Neural Network (TPR-BBGAN), Hybrid Random Forest and Logistic Regression (HRFLR) and Instant-Hybrid Neural-Cryptography (IHNC). Data confidentiality (DC) is expressed as a percentage (%), giving the proportion of textual image data accessed securely and only by the target recipient.
The second evaluation is carried out for data integrity, which concerns ensuring that the sample text image remains consistent and free from interference. Deep learning techniques such as an autoencoder play a major role in ensuring data integrity by providing a mechanism to detect any prohibited modifications to the sample text image. Higher data integrity indicates a more efficient method.
Table 3 shows the data integrity results obtained by substituting the values in Equation (12). To ensure fair comparison, the same sample text images are applied to all three methods: Twister Pseudo Random and Barzilai–Borwein Gradient Autoencoder Neural Network (TPR-BBGAN), Hybrid Random Forest and Logistic Regression (HRFLR) and Instant-Hybrid Neural-Cryptography (IHNC). Data integrity (DI) is expressed as the percentage (%) of textual image data that remains unchanged during transmission.
The outcome shows the data integrity results using the proposed TPR-BBGAN and the two existing methods, HRFLR and IHNC. From Figure 5, although a decreasing trend in data integrity was observed for the first set of 65 textual image data, an increasing trend was found for the remaining set of 130 textual image data, corroborating the objective of increased data integrity for a larger set of images. The data integrity results using the proposed TPR-BBGAN method were also higher than those of HRFLR and IHNC, as is evident from the sample results: with 26 textual sample image data provided as input, the data integrity rate using the three methods was 88.46%, 84.61% and 76.92%, respectively. This result indicates that the data integrity of the proposed TPR-BBGAN method is comparatively better than that of HRFLR and IHNC. The improvement is attributed to the application of the Barzilai–Borwein gradient function: instead of assessing the reconstruction error against a random threshold, the threshold was evaluated using the Barzilai–Borwein gradient function. The optimal threshold was arrived at using two-point approximations based on two iterates, the preceding and the succeeding. This, in turn, minimised the number of textual image data altered by malicious users using the proposed TPR-BBGAN method and therefore improved the data integrity by 6% compared to HRFLR and 17% compared to IHNC.
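As a worked check of the sample results quoted above, the three rates are consistent with 23, 22 and 20 unaltered images out of 26 under the percentage-ratio definition of data integrity. These counts are inferred here from the reported percentages and are not stated in the text; note that 22/26 rounds to 84.62%, so the reported 84.61% appears to be truncated rather than rounded.

```latex
\begin{align*}
DI_{\text{TPR-BBGAN}} &= \frac{23}{26} \times 100 \approx 88.46\% \\
DI_{\text{HRFLR}}     &= \frac{22}{26} \times 100 \approx 84.62\% \\
DI_{\text{IHNC}}      &= \frac{20}{26} \times 100 \approx 76.92\%
\end{align*}
```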
Bit error rate analysis follows next, which refers to the number of countable bit errors occurring over a period of time. Some of the reasons for the occurrence of bit rate may be due to noise or PSNR, signal intervention and so on. The lower the BER, the higher the efficiency of the method is said to be.
Table 4, given below, lists the BER results by substituting the values in Equation (13) for all three methods: Twister Pseudo Random and Barzilai–Borwein Gradient Autoencoder Neural Network (TPR-BBGAN), Hybrid Random Forest and Logistic Regression (HRFLR) and Instant-Hybrid Neural-Cryptography (IHNC). Bit Error Rate (BER) represents the proportion of improper bits of information between the original and reconstructed textual image data.
The trend above illustrates the BER obtained using the three methods, TPR-BBGAN, HRFLR and IHNC. From Figure 5, two inferences can be drawn. First, the BER differs for all three methods across the different sizes of textual image data provided as input. With simulations performed using a 21.3 kb image, the BER of the three methods was observed to be 0.033, 0.0379 and 0.041, respectively, so the BER of TPR-BBGAN was lower than that of HRFLR and IHNC. Second, with a 460.0 kb image size, the BER of the three methods was found to be 0.048, 0.061 and 0.072, whereas with a 48.9 kb image size it was found to be 0.035, 0.045 and 0.078, respectively. From this result, a larger image size does not substantially compromise the BER. In all of the above results, a lower BER was observed using the proposed TPR-BBGAN method compared to HRFLR and IHNC. The reduction in BER was due to the application of Fuzzy Batch-Normalised Preprocessing. By applying this preprocessing model, a stronger representation of the textual portions was retrieved for further processing. This, in turn, reduced the BER using the proposed TPR-BBGAN method by 30% compared to HRFLR and 46% compared to IHNC.
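As an illustration of the BER definition given above, the proportion of erroneous bits can be sketched as follows. This is a minimal sketch and not the authors' implementation of Equation (13); the function name `bit_error_rate` and the sample byte values are hypothetical and serve only to show the arithmetic.

```python
def bit_error_rate(original: bytes, reconstructed: bytes) -> float:
    """Fraction of bits that differ between two equal-length byte strings."""
    if len(original) != len(reconstructed):
        raise ValueError("inputs must have the same length")
    if not original:
        raise ValueError("inputs must be non-empty")
    # XOR leaves a 1 wherever the two bytes disagree; count those bits.
    wrong_bits = sum(bin(a ^ b).count("1") for a, b in zip(original, reconstructed))
    return wrong_bits / (8 * len(original))


# Hypothetical data: two of the 32 bits are flipped in the reconstruction.
original = bytes([0b10101010, 0b11110000, 0b00001111, 0b11111111])
reconstructed = bytes([0b10101011, 0b11110000, 0b00001111, 0b01111111])
print(bit_error_rate(original, reconstructed))  # 2 / 32 = 0.0625
```

In practice the two inputs would be the raw pixel buffers of the original and the decrypted textual image.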
Table 5, given below, lists the PSNR results by substituting the values in Equations (14) and (15) for all three methods: Twister Pseudo Random and Barzilai–Borwein Gradient Autoencoder Neural Network (TPR-BBGAN), Hybrid Random Forest and Logistic Regression (HRFLR) and Instant-Hybrid Neural-Cryptography (IHNC).
While performing encoder and decoder operations, the fidelity of the reconstructed images with respect to the original textual image data has to be measured. PSNR measures this as a ratio: the higher the PSNR, the closer the reconstructed image is to the original image. PSNR is expressed in decibels (dB), and a higher PSNR indicates enhanced reconstruction quality.
Finally, the above outcomes show the PSNR obtained using the three methods, TPR-BBGAN, HRFLR and IHNC. From the above figure, a higher PSNR is found using the proposed method compared to HRFLR and IHNC. The PSNR was also not observed to be uniform across the three methods, owing to the different image sizes. PSNR improved using the TPR-BBGAN method due to the application of the Barzilai–Borwein Gradient Autoencoder Neural Network-based encryption/decryption model. By applying this model, the Barzilai–Borwein gradient was used to reduce the divergence between the input text image data and the reconstructed text image data. With this, the corresponding weight and bias matrices were refined and tuned, improving the PSNR using the TPR-BBGAN method by 6% compared to HRFLR and 20% compared to IHNC.
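The PSNR computation can be illustrated with a short sketch. Equations (14) and (15) are not reproduced in this section, so the code below assumes the conventional definition for 8-bit images: the mean squared error (MSE) between original and reconstructed pixels, followed by PSNR = 10 log10(255^2 / MSE). The function names and pixel values are hypothetical.

```python
import math


def mse(original, reconstructed):
    """Mean squared error between two equal-length sequences of pixel values."""
    if len(original) != len(reconstructed) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)


def psnr(original, reconstructed, peak=255):
    """Peak signal-to-noise ratio in dB; infinite for a perfect reconstruction."""
    error = mse(original, reconstructed)
    if error == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / error)


# Hypothetical 8-pixel row: three pixels are off by one grey level.
original = [52, 55, 61, 66, 70, 61, 64, 73]
reconstructed = [52, 54, 61, 67, 70, 61, 65, 73]
print(mse(original, reconstructed))   # 3 / 8 = 0.375
print(psnr(original, reconstructed))  # PSNR in dB for the row above
```

A smaller MSE gives a larger PSNR, which is why a higher PSNR is read as better reconstruction quality.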
4.2. Security Analysis
The proposed model also undergoes a comprehensive security analysis to assess its encryption strength, adopting standard and universally accepted performance metrics. These include image entropy, which assesses the randomness of the encrypted image. Two further metrics, Number of Pixels Change Rate (NPCR) and Unified Average Changing Intensity (UACI), quantify the sensitivity of the encryption scheme to minor changes in the input, thereby indicating resistance against differential attacks. The assessment has been carried out on different samples of textual images to ensure reliability and consistency of the outcome, as shown in Table 6.
The numerical scores shown in Table 6 indicate that the proposed model achieves an entropy value very close to the ideal value of 8 for 8-bit images, showing effective randomness and resistance against statistical attacks. The NPCR score exceeds 99%, while the UACI score is approximately 33%, indicating the potential to resist differential attacks. The results offer empirical evidence that the proposed model complies with standard security demands of cryptography, while also retaining a high degree of stability across different samples.
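The three metrics can be sketched as follows, using their commonly used definitions for 8-bit greyscale images. The exact formulas used by the authors are not given in this section, so the function names and the tiny sample inputs are hypothetical. For reference, two independent uniformly random 8-bit images have an expected NPCR of about 99.61% and an expected UACI of about 33.46%, which is consistent with the scores reported above.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy in bits per pixel of a flat sequence of 8-bit values."""
    total = len(pixels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(pixels).values())


def npcr(cipher_a, cipher_b):
    """Percentage of pixel positions whose values differ between two cipher images."""
    changed = sum(1 for a, b in zip(cipher_a, cipher_b) if a != b)
    return 100.0 * changed / len(cipher_a)


def uaci(cipher_a, cipher_b):
    """Mean absolute intensity difference as a percentage of the 8-bit range (255)."""
    total_diff = sum(abs(a - b) for a, b in zip(cipher_a, cipher_b))
    return 100.0 * total_diff / (255 * len(cipher_a))


uniform = list(range(256))        # every grey level exactly once
print(shannon_entropy(uniform))   # 8.0 for a perfectly uniform histogram

# Hypothetical cipher images obtained from two slightly different plain images.
cipher_a = [0, 10, 20, 30]
cipher_b = [0, 11, 25, 30]
print(npcr(cipher_a, cipher_b))   # 2 of 4 pixels differ
print(uaci(cipher_a, cipher_b))   # (1 + 5) / (255 * 4), as a percentage
```

In a differential test, `cipher_a` and `cipher_b` would be the encryptions of two plain images that differ in a single pixel.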
Apart from the security analysis using entropy, NPCR, and UACI, the security strength of the proposed system is also examined for its key generation method with respect to key space and brute-force attacks. The proposed model combines Mersenne Twister pseudorandom number generation, arbitrary generation of the private key, and prime-centric generation of the public key. This combination yields a potentially large key space exceeding 2^128, which renders exhaustive search computationally infeasible for any attacker or compromised node. The degree of randomness of the generated key is further supported by the high entropy values noted in the encrypted image. This favours resistance against key-guessing and brute-force attacks.
It is essential to note that the primary validation mechanism of the proposed study is detection based on reconstruction error, which can be vulnerable to complex forms of adversarial manipulation. To address this security problem, an auxiliary validation mechanism can be added, using adversarial robustness methods or checks on the latent space distribution. Such an addition would enhance resilience against adversarial bypass attempts and improve the trustworthiness of detecting compromised/victim nodes.
The convergence of the proposed TPR-BBGAN is assessed to ensure efficient optimisation and stability. Convergence is boosted using the Barzilai–Borwein gradient method, which uses an adaptive step size and performs better than classical gradient descent. A closer look at the outcome of the proposed model shows minimised reconstruction error, reached with a reduced number of iterations. This shows that the proposed model achieves stable convergence behaviour without pronounced oscillation. In addition, the two-point gradient approximation facilitates effective exploration of the optimisation landscape, which reduces the probability of local instability or divergence. Based on these characteristics, it can be stated that training exhibits effective stability as well as good convergence speed.
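The adaptive step size mentioned above can be illustrated with the classical Barzilai–Borwein (BB1) rule, in which the step is computed from two consecutive iterates and their gradients as (s·s)/(s·y), with s the change in the iterate and y the change in the gradient. The sketch below applies this rule to a toy quadratic objective. It is not the authors' autoencoder training procedure; the objective, the function names, the initial step, and the fallback for non-positive curvature are all assumptions made for illustration.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def gradient(x):
    """Gradient of the toy objective f(x) = 0.5*(x0**2 + 10*x1**2) - x0 - x1."""
    return [x[0] - 1.0, 10.0 * x[1] - 1.0]


def barzilai_borwein(grad, x0, first_step=0.01, max_iterations=200, tol=1e-10):
    """Minimise a smooth function using the BB1 two-point step size."""
    x_prev = list(x0)
    g_prev = grad(x_prev)
    # No previous iterate exists yet, so the first move uses a fixed step.
    x = [xi - first_step * gi for xi, gi in zip(x_prev, g_prev)]
    for _ in range(max_iterations):
        g = grad(x)
        if dot(g, g) ** 0.5 < tol:
            break
        s = [a - b for a, b in zip(x, x_prev)]  # change in the iterate
        y = [a - b for a, b in zip(g, g_prev)]  # change in the gradient
        sy = dot(s, y)
        # BB1 step (s.s)/(s.y); fall back to the fixed step if curvature is not positive.
        step = dot(s, s) / sy if sy > 0 else first_step
        x_prev, g_prev = x, g
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


minimiser = barzilai_borwein(gradient, [0.0, 0.0])
print(minimiser)  # close to the analytic minimiser [1.0, 0.1]
```

Because the step is recomputed from the two most recent iterates, no learning rate has to be tuned by hand beyond the first move, which is the property the convergence discussion relies on.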
4.3. Extensive Analysis
From the previous section, it has been shown, using various examples and parameters, that the proposed system is effective compared to certain existing baseline models. This section extends the discussion by comparing the performance of the proposed model with different state-of-the-art methods [8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26]. Different types of methods are involved, such as hybrid neural-cryptography, GAN-based methods, DL-based cryptography, and chaotic encryption frameworks. The analysis does not depend only on numerical metrics such as PSNR and BER; it also considers qualitative perspectives such as inherent methodological limitations and resilience to different threats.
Table 7 presents the outcome of this evaluation, which offers deeper insight into the practical utilisation, adaptability, and robustness of the proposed model as well as the other models. An effective security model is expected to yield a higher PSNR and a lower BER.
The outcome of the extensive analysis shown in Table 7 above exhibits better consistency for the proposed model, which compares favourably with existing state-of-the-art methods. It also indirectly indicates that the proposed system can resist different forms of threats of varied intensity, viz., adversarial, differential, statistical, and brute-force attacks. This performance is attributed to the combination of key generation with adaptive threshold optimisation using the Barzilai–Borwein gradient method. In contrast, the majority of existing methods offer resistance to a particular form of attack but suffer from shortcomings such as a lack of adaptive learning methods, high computational overhead, and static parameter tuning. In addition, the proposed model offers minimised BER owing to effective suppression of noise by fuzzy batch normalisation preprocessing, and increased PSNR owing to enhanced reconstruction accuracy. Although the proposed model offers a performance gain, it shows slight dependencies on preprocessing that can be further optimised. Overall, the proposed model offers a comprehensive and balanced solution for securing textual image data during both communication and storage.
4.4. Discussion
From the previous section, it is evident that the TPR-BBGAN outperforms the existing baselines, HRFLR and IHNC. There are multiple reasons for the observed trends.
HRFLR essentially emphasises the classification task and does not delve much into optimising the secure reconstruction process. Hence, this method lacks any form of dynamic threshold tuning, and no gradient-centric adaptive optimisation is involved. This leads to higher BER and lower PSNR. IHNC, in turn, mainly depends upon the synchronisation of a static neural-based encryption method. This mechanism does not incorporate adaptive step-size optimisation or preprocessing refinement, which potentially results in substantially higher reconstruction error. Hence, confidentiality rates and data integrity are potentially affected. Both HRFLR and IHNC treat learning components and encryption separately, whereas the proposed system combines optimisation, preprocessing, and key generation within a single model.
To offer more insight into the outcomes of the proposed model, the proposed system has also been assessed for its computational efficiency. Table 8 presents the computational efficiency of the proposed model against various baseline models. The aim is to assess the suitability of the models for real-time application. Efficiency is analysed with respect to encryption time, decryption time, and memory usage. The proposed system is implemented on a standard CPU, and no GPU is used for this comparative analysis. The same environment is maintained for the other baseline models (HRFLR, IHNC, CNN, GAN).
The results in Table 8 show that the TPR-BBGAN requires the least time for both encryption and decryption among the compared baselines. A similar trend is noted for memory consumption, mainly due to efficient Barzilai–Borwein optimisation and the autoencoder design. Unlike GAN-centric models that demand extensive resources, the proposed model retains a better balance between efficiency and security. This is empirical evidence that the proposed model is well suited to resource-limited edge computing environments.
The better outcome achieved by the proposed model has several concrete reasons. The first is the robust preprocessing in the TPR-BBGAN, where fuzzy batch normalisation enhances the textual region before encryption. This reduces reconstruction distortion and the propagation of noise, thereby increasing PSNR while minimising BER. The second reason is the use of dual-level key generation. The TPR-BBGAN contributes to the unpredictability and randomness of the secret key through the combined use of a Mersenne Twister-based private key and a prime-based public key. Hence, resistance against unauthorised access is significantly improved, which improves data confidentiality. The third reason is the optimised autoencoder training. The system boosts convergence with the aid of the Barzilai–Borwein method, while the reconstruction error is significantly reduced. Unlike traditional gradient descent, the TPR-BBGAN uses information from prior iterations to dynamically fine-tune learning attributes. This contributes to the accuracy of encryption–decryption, while data integrity progressively improves. Finally, the TPR-BBGAN performs better than existing methods due to its threshold-based malicious detection. It does not rely on a static threshold; instead, it uses gradient-based optimisation to compute an adaptive threshold. The entire process improves the reliability of secure data sharing.
5. Conclusions
This manuscript presents the TPR-BBGAN, a hybrid cryptographic framework that integrates Mersenne Twister-based private key generation, prime-based public key formulation, and the Barzilai–Borwein method for adaptive optimisation. This multilayer integration has resulted in minimised reconstruction error, enhanced convergence speed, and improved randomness. The study outcome shows that the TPR-BBGAN enhances data confidentiality by 12–20% and data integrity by 6–17%. The TPR-BBGAN is observed to reduce BER by 30–40% and to increase PSNR by 6–20% compared to existing models. The outcomes are consistent across the tested performance metrics, showing that the TPR-BBGAN offers an efficient, robust, and novel secure data-sharing model.
Although the proposed model performs well on multiple assessment metrics, its experimentation has been limited with respect to images with complex backgrounds, high-resolution data, and colour images. These limitations can be addressed in future work by extending the model to multi-channel image processing. The aim will be to improve generalisation over different image domains.
Future work will also explore GPU-centric deep learning architectures and blockchain-based key management. The new model can be assessed under complex adversarial attack scenarios and further integrated with quantum-resistant encryption for secure communication.