Improving Face Image Transmission with LoRa Using a Generative Adversarial Network

Babayiğit, Bilal; Yarlı Doğan, Fatma

doi:10.3390/app152111767

by Bilal Babayiğit ¹ and Fatma Yarlı Doğan ²,*

¹ Department of Computer Engineering, Faculty of Engineering, Erciyes University, 38260 Kayseri, Türkiye

² Department of Computer Engineering, Graduate School of Natural and Applied Sciences, Erciyes University, 38260 Kayseri, Türkiye

* Author to whom correspondence should be addressed.

Appl. Sci. 2025, 15(21), 11767; https://doi.org/10.3390/app152111767 (registering DOI)

Submission received: 8 October 2025 / Revised: 28 October 2025 / Accepted: 1 November 2025 / Published: 4 November 2025

(This article belongs to the Section Computing and Artificial Intelligence)

Abstract

Although LoRa can be highly valuable in remote areas lacking internet or cellular data, the difficulties it presents in large data transmission have prevented it from developing sufficiently for image transmission. This challenge is particularly relevant for applications requiring the transfer of facial images, such as remote security or identification. These difficulties can be overcome by reducing the data size through image processing methods. In this study, a face-focused enhanced super-resolution generative adversarial network (ESRGAN) is trained to address the significant quality loss in low-resolution face images that reach the receiver after such processing. The trained ESRGAN model is also compared with the Real-ESRGAN model and a standard bicubic interpolation baseline. In addition to Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index Measure (SSIM), Learned Perceptual Image Patch Similarity (LPIPS) for perceptual quality and a facial identity preservation metric are used to measure the similarity of the produced super-resolution (SR) images to the original images. Practical tests demonstrate that a facial image that takes 42 min to transmit via LoRa can be transmitted in about 5 s using image processing techniques, and that the images can be enhanced at the receiver to closely resemble the originals. Thus, with an integrated system that enhances the transmitted visual data, it becomes possible to transmit compressed, low-resolution image data using LoRa. The study aims to contribute to remote security and identification work in regions where internet and cellular data transmission is difficult by making significant improvements in image transmission with LoRa.

Keywords:

computer vision; ESRGAN; GAN; Internet of Things (IoT); LoRa; LPWAN

1. Introduction

LoRa (Long Range), an important member of LPWAN (Low Power Wide Area Network) technology, has become a popular research topic in IoT (Internet of Things) projects due to its low-power, long-range data transmission capabilities. LoRa technology is typically used in IoT applications for transmitting small amounts of data, such as sensor data, due to its low data transmission speeds and limited bandwidth. Along with small sensor data, image data is also an important data type for many critical applications, such as remote security monitoring, access control, or human identification. LoRa's long-range communication capabilities and low power consumption can significantly contribute to developing these security and identification-based studies in areas without reliable cellular coverage. However, due to LoRa's limited data transmission capacity, transmitting image data poses a significant challenge. Therefore, solving the image transmission problem with LoRa is a significant research topic. Guerra and colleagues proposed a low-speed encoder for image transmission using LoRa modules. The aim is to compress images before transmission and reduce the impact of data loss that may occur during transmission [1]. Kirichek and colleagues used JPEG and JPEG2000 compression algorithms for image transmission in their study on image and sound transmission via a LoRa module on an unmanned aerial vehicle (UAV). It was determined that the JPEG2000 compression format resulted in less distortion after transmission [2]. Wei and colleagues combined WebP compression and base64 encoding methods to transmit a 200 × 150-pixel image over LoRa in 25.7 s [3]. Ji and colleagues proposed a technique that divides the image into grid segments and sends only the parts of the image that contain differences [4]. Chen and colleagues applied JPEG compression to images for agricultural monitoring and transmitted them via LoRa. To reduce transmission time, they proposed a transmission protocol called the Multi-Packet LoRa Protocol (MPLR) with packet acknowledgment requirements. While this approach is suitable for small-sized images, the system becomes inefficient as the amount of data increases [5]. Haron and colleagues investigated suitable compression methods for transmitting large images over wireless networks for telemetry data. They used the Fast Fourier Transform (FFT) method to compress a 512 × 512-pixel grayscale image with a file size of 176 KB [6].

This study aims to address the image transmission problem with LoRa by converting image data to low resolution before transmission. Then, at the receiver node, we will enhance these images using the Enhanced Super-Resolution Generative Adversarial Network (ESRGAN) model, producing high-resolution images that closely resemble the original.

Introduced in 2014 by Ian Goodfellow et al. [7] and one of the key components of modern generative models [8], generative adversarial networks (GANs) comprise two competing networks: the Generator and the Discriminator. For image data, the generator network produces new (fake) images while the discriminator network decides whether the images it is shown are real or fake. The discriminator's decisions are fed back to the generator, so that over time the generated images become indistinguishable from real images. Figure 1 shows the basic working logic of the GAN architecture.

Following the discovery of the basic GAN architecture, Radford et al. [9] developed the deep convolutional generative adversarial network (DCGAN) in 2015 to produce synthetic images. With the SRGAN [10] proposed in 2017 and the subsequent ESRGAN [11], significant improvements were achieved in enhancing the visual quality of low-resolution images. The development of these studies has revealed the effectiveness of GAN-based approaches in improving images that are corrupted or have low quality in transmission environments with limited bandwidth.

Super-resolution (SR) imaging is the process of obtaining high-resolution (HR) images from low-resolution (LR) images. The theoretical basis of the SR technique dates back to Harris’s 1964 work on information extraction from images generated by diffraction-limited optical systems [12]. Tsai and Huang first addressed the idea of SR to improve the spatial resolution of Landsat TM images [13]. Based on this idea, different methods were used to focus on SR [14,15,16,17]. In studies on face images, face hallucination approaches were applied as an SR method to obtain high-resolution face images from low-resolution face images [18,19].

With the development of deep learning techniques, researchers have achieved significant success by using deep learning-based methods for SR. In particular, GANs [7] have emerged as an effective method for generating high-resolution images from low-resolution images. The SRGAN (Super-Resolution Generative Adversarial Network) model, pioneered by Ledig and colleagues in 2017, has become a key reference in the literature [10]. In 2018, Wang and colleagues developed ESRGAN (Enhanced Super-Resolution Generative Adversarial Network), an improved version of the SRGAN model that produces more realistic and natural textures [11]. In 2021, Wang and colleagues further developed ESRGAN to propose Real-ESRGAN, which can better simulate complex real-world distortions [20]. Real-ESRGAN has been used for resolution enhancement of various medical images [21,22,23], for creating higher-resolution images from low-quality data in remote monitoring applications such as disaster and landslide detection [24], and for general image enhancement applications [25,26].

There are studies in the literature that propose GAN-based restoration methods to solve the problems of low bandwidth and low channel SNR in wireless image transmission [27,28]. However, these studies do not define a specific physical transmission network; they consider the problem under the assumption of a low-bandwidth wireless channel and primarily focus on developing a GAN architecture.

In contrast, our study emphasizes that LoRa communication is a wireless transmission technology that provides transmission over long ranges under low bandwidth conditions and is advantageous in terms of low energy consumption. Accordingly, it proposes a GAN-supported solution to the problem of image transmission over LoRa. To our knowledge, our study is the first to apply a GAN-based image restoration approach to the LoRa-based image transmission problem.

In this study, face images are first downsampled to reduce their resolution, then compressed with JPEG or WebP and sent via LoRa. At the receiver node, a super-resolution image is produced from the transmitted low-resolution image using the ESRGAN model trained within the scope of the study.

The remainder of this paper is organized in the following sections. Section 2 describes in detail the materials and methods used in the study, as well as the implemented LoRa transmission and ESRGAN experiments. Section 3 presents the results obtained from the experiments and the discussions. Section 4 presents the study conclusions and recommendations for future work.

2. Materials and Methods

Our study aims to reduce transmission times to overcome the difficulties in image transmission with LoRa. To this end, image processing techniques are applied to the images to be transmitted, reducing their resolution, followed by image compression. In this study, JPEG and WebP compression techniques are compared for image compression, and LoRa technology is utilized for transmitting the compressed images. To enhance the resolution of the low-resolution image transmitted to the receiver via LoRa, the Real-ESRGAN model and the ESRGAN model trained on face images are run on the Raspberry Pi 4 B (Raspberry Pi OS, Debian 11 "Bullseye") that is connected to the LoRa receiver. Figure 2 shows the workflow of the study.

In this study, a network system consisting of two SX1262 LoRa HAT (E22 900T22S) modules, one for the sender and one for the receiver, each mounted on a Raspberry Pi 4, provides point-to-point communication. To ensure that the image is transmitted as quickly as possible, the image to be transmitted on the sender Raspberry Pi 4 is downsampled and compressed to a minimal data size, and then converted into an ASCII string using base64 encoding. The base64-encoded string is divided into packets suitable for the LoRa module's limited transmission size and transmitted. For the SX1262 LoRa HAT module, the packet transmission length is 240 bytes in total, consisting of a 6-byte header and a 234-byte message. Therefore, each data packet carries up to 234 bytes of data.
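The packetization step can be illustrated with a minimal sketch. It assumes only what the text states (234 payload bytes per packet); the helper names `split_into_payloads` and `join_payloads` are hypothetical, the 6-byte header is left to the module and not modeled, and the input bytes are a stand-in rather than a real image.

```python
import base64

PAYLOAD_BYTES = 234  # payload size per packet, as stated in the text


def split_into_payloads(b64_text: str, payload_bytes: int = PAYLOAD_BYTES) -> list[bytes]:
    """Split a base64 string into ASCII payloads of at most payload_bytes each."""
    data = b64_text.encode("ascii")
    return [data[i:i + payload_bytes] for i in range(0, len(data), payload_bytes)]


def join_payloads(payloads: list[bytes]) -> str:
    """Concatenate payloads in arrival order to rebuild the base64 string."""
    return b"".join(payloads).decode("ascii")


# Stand-in for compressed image bytes (hypothetical data, not a real image).
compressed = bytes(range(256)) * 4
encoded = base64.b64encode(compressed).decode("ascii")

payloads = split_into_payloads(encoded)
restored = base64.b64decode(join_payloads(payloads))
```

Because reassembly here relies purely on arrival order, the sketch only works when no packet is lost or reordered, which matches the in-order concatenation described for the receiver.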

The image data received in packets via LoRa is appended, in the order it was received, to a *.txt file in the relevant file location on the Raspberry Pi 4. This file, which contains the base64 string, is then converted into a JPG image. The converted image is of lower quality than the original image due to the downsampling and compression processes applied before transmission. The received low-resolution image is reproduced using super-resolution-based GAN techniques such as ESRGAN and Real-ESRGAN. The resulting images are evaluated using peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM) metrics.

Informed consent was obtained from the relevant individuals for the use of their face images in most of the illustrations in the study. Other face images were selected from a publicly accessible face database [29]. Additionally, another face image dataset containing open-access images was used to train the ESRGAN network [30].

The study’s methodology comprises five basic steps, as illustrated in Figure 3.

2.1. Compression of the Image to Be Transmitted with LoRa

The methods used to compress an image are divided into two categories: lossy and lossless methods. Because the LoRa network is suitable for low-size data transmission, lossy compression methods are generally used to reduce the size of images transmitted over the LoRa network to the smallest size possible. However, lossless compression methods can also be applied for scenarios where data must be transmitted without loss, such as medical imaging.

Since the study is based on image production techniques using GAN, lossy compression methods are expected to be more effective in terms of the study results. Since our research focuses on the reproduction of low-resolution and degraded images, the degradation in image quality and data loss resulting from lossy image compression is considered part of the study rather than a problem. For this reason, the compression process will be performed using the lossy compression techniques JPEG and WebP in the compression stage to increase compression efficiency. The compression processes we use in the Python programming language, utilizing the Pillow and base64 libraries, significantly reduce the size of the image to be transmitted. As a result, images with considerably reduced resolution are transmitted via LoRa in short periods.

After image transmission, super-resolution images will be generated from the low-resolution image received at the receiver node using Real-ESRGAN and ESRGAN. The generated images will be compared according to PSNR and SSIM metrics.

2.1.1. JPEG Compression

The JPEG compression method offers high compression rates but results in significant quality loss at low bit rates. The original image (1124 × 1679 pixels) and the results of JPEG compression applied to this image at different quality factors are shown in Figure 4 for comparison. This illustrates that as the quality factor decreases with JPEG compression, the image quality also decreases. However, the image dimensions are not changed; only the way the data is stored changes.

Figure 5 shows the results of JPEG compression after a 10× downsampling process.

Downsampling directly reduces the number of pixels by reducing image resolution, thus significantly reducing the amount of data to be processed or compressed. Since super-resolution image reproduction with Real-ESRGAN is performed on the receiver node, the images in Figure 5, obtained by JPEG compression at different quality factors (q = 10, 30, 60) after the downsampling process, are the ones transmitted via LoRa. These images also meet the minimum data size requirement for rapid image transmission via LoRa.

2.1.2. WebP Compression

WebP is a more modern compression method that offers less distortion and a better compression ratio than JPEG. To compare it with JPEG compression, WebP compression with different quality factors (q = 10, 30, 60) was applied to the same original image (1124 × 1679 pixels) and the results are shown in Figure 6. When WebP compression is performed with a quality factor of q = 10, the original image size of 358 KB is reduced to 33.7 KB.

When downsampling is applied before WebP compression, the compression results with different quality factors are shown in Figure 7. According to these results, although the quality perceived by the human eye is significantly reduced, the original 358 KB image is reduced to 0.805 KB as a result of the q = 10 quality factor compression process. In the following sections of the study, this degraded image, sent via LoRa to the receiver LoRa node, was regenerated with Real-ESRGAN and ESRGAN for resolution enhancement. The results are compared with those produced using Real-ESRGAN and ESRGAN for the image compressed with JPEG.

A comparative table of JPEG and WebP compression is shown in Table 1.

According to this table, WebP compression yields significantly better results than JPEG compression. Both images compressed with JPEG and WebP at q = 10 appear to be of inferior quality to the human eye. However, the PSNR values for both results were calculated as 22.87 dB for JPEG (q = 10) and 23.67 dB for WebP (q = 10). Considering these results, WebP compression achieves both a better PSNR and compression than JPEG, achieving the targeted smaller data size.

The image compression was performed on a Raspberry Pi 4, to which the sender’s LoRa module is also connected, using the Pillow and base64 libraries in the Python programming language.

Since JPEG- and WebP-compressed data is in binary format and cannot be reliably stored in plain text files, the binary buffer is encoded using base64 encoding. Base64 encoding converts binary data into ASCII text in a fully reversible way, so the content is preserved. The resulting base64 string is saved in a .txt file. The image data in this file is read, divided into packets, and transmitted from the sender LoRa to the receiver LoRa. The data packets read by the receiver LoRa are saved to a .txt file. For base64 decoding, the .txt file is read, and the base64 string is decoded back into its original binary format. The resulting data is then saved as a viewable image file, either .jpg or .webp, depending on the compression format used. This method enables end-to-end compression and transmission of image data via plain text while maintaining reasonable visual quality through a DCT-based compression algorithm.
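The encode, store, read, and decode round trip described above can be sketched with the Python standard library alone. The function names and the file name `image_b64.txt` are hypothetical, and the input is a short placeholder byte string rather than a real JPEG or WebP file.

```python
import base64
import tempfile
from pathlib import Path


def save_as_base64_text(binary: bytes, txt_path: Path) -> None:
    """Encode binary data as base64 and store it in a plain text file."""
    txt_path.write_text(base64.b64encode(binary).decode("ascii"), encoding="ascii")


def load_from_base64_text(txt_path: Path) -> bytes:
    """Read a base64 text file and decode it back to the original bytes."""
    return base64.b64decode(txt_path.read_text(encoding="ascii"))


with tempfile.TemporaryDirectory() as tmp:
    txt_file = Path(tmp) / "image_b64.txt"        # hypothetical file name
    original = b"\xff\xd8\xff\xe0" + bytes(100)   # placeholder bytes, not a valid image
    save_as_base64_text(original, txt_file)
    recovered = load_from_base64_text(txt_file)
```

The recovered bytes are identical to the input, which is the reversibility property the transmission chain relies on.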

2.2. Image Transmission with LoRa

LoRa technology is preferred in IoT applications over other network technologies due to its ability to transmit data over long distances (kilometers) without requiring hops and its low energy requirements. This technology can provide a range of up to 20 km in rural areas and 5 km in urban areas [31]. Furthermore, thanks to its low power consumption, devices can achieve years of battery life [32,33]. LoRa provides a significant advantage over other LPWAN technologies by using Chirp Spread Spectrum (CSS) modulation, which provides a high level of interference resistance [34]. In addition, there are no licensing costs because the LoRa network operates on unlicensed frequency bands [35,36]. However, LoRa cannot use these public bands without restriction and must comply with a 1% duty cycle limit per hour [37,38,39]. The duty cycle limitation means that LoRa modules can transmit data for 36 s per hour on the same band. Combined with the low transmission speeds of LoRa modules, this limitation makes visual environment monitoring via LoRa challenging.

In LoRa communication, the time on air and transmission range of the data increase at low transmission speeds. At high transmission speeds, the time on air and the transmission range of data decrease. For this reason, if transmission is desired over long range, data transmission must be performed at lower transmission speeds.

LoRa Transmission Conditions

In the study, LoRa transmissions were conducted using real hardware devices. E22 900T22S LoRa modules were installed on two Raspberry Pi 4 devices, one as a receiver and one as a transmitter. A 5 dBi antenna was used for each LoRa module. Additionally, 10,000 mAh power banks were used for long-distance transmissions. The characteristics of the LoRa module used in the study are listed in Table 2.

Some of the experiments were conducted at close range by connecting Raspberry Pi 4 devices to displays. This way, data packets sent and received were monitored and packet loss was tested continuously. The data packet size for the LoRa module is 240 bytes. Six bytes of this constitute the header information, and the remaining 234 bytes are for data. Because image data, due to its large size, must be transmitted in multiple 234-byte packets, collisions occur if a new packet arrives while the receiver is processing the previous one. Therefore, to prevent packet loss, the LoRa sender and receiver codes include delay times after each packet transmission.
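The inter-packet delay can be sketched as a simple send loop. This is an illustration only: the `send` callable stands in for whatever function writes a payload to the LoRa module, and both it and the `delay_s` value are assumptions, since the text does not give the driver interface or the delay used.

```python
import time
from typing import Callable, Iterable


def send_with_delay(payloads: Iterable[bytes],
                    send: Callable[[bytes], None],
                    delay_s: float) -> int:
    """Send payloads one by one, pausing after each; returns the number sent."""
    sent = 0
    for payload in payloads:
        send(payload)         # hand one payload to the (hypothetical) radio driver
        time.sleep(delay_s)   # give the receiver time to finish the previous packet
        sent += 1
    return sent


# Usage with a list standing in for the radio link.
received: list[bytes] = []
count = send_with_delay([b"abc", b"def", b"gh"], received.append, delay_s=0.0)
```

In practice the delay would be tuned on the hardware until no packets are lost, trading total transmission time against reliability.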

Once the experimental parameters were stable, image transmission was tested in an open area over a distance of 1500 m. The 868 MHz frequency band, a 125 kHz bandwidth, and a 2.4 kbps air data rate were used for transmission. The LoRa transmission times in our study were measured in these experiments. No packet loss was observed under these conditions.

2.3. Face-Focused Enhanced Super-Resolution Generative Adversarial Network (ESRGAN)

To train the super-resolution generative adversarial network developed in this study, we used 1000 images from the CelebA face image dataset as the training dataset [30].

The images in the dataset consist of mixed-resolution images and have similar resolutions. Our ESRGAN model uses two image datasets for its training: high-resolution (HR) and low-resolution (LR). The original front-facing images served as an HR image dataset. The original images were downscaled by a factor of 4 to create a bicubic LR image dataset.

While the model is being trained, the generator network is fed with LR images and attempts to produce super-resolution (SR) images. The discriminator network is trained using the generated SR images and the original images from the HR dataset. The generator network attempts to deceive the discriminator network into judging the generated image to be real. The discriminator network measures how fake the generated image appears using a loss function (adversarial loss). This loss value is fed back to the generator network. As a result, the generator network learns to produce more realistic images.

Figure 8 illustrates the generator architecture of the trained network. Unlike the SRGAN network, this model incorporates a Residual-in-Residual Dense Block architecture without batch normalization. The perceptual loss has also been improved [11].

In our study, the ESRGAN network is trained on a Windows Server 2016 virtual machine equipped with a 32-core CPU and 64 GB of RAM. The image dataset consists of 1000 face images. Training was performed over 20 epochs. The average PSNR in the final epoch was 27.17 dB. The average PSNR over all 20 epochs was 24.52 dB.

The methodology applied in the study for image enhancement in the receiver consists of 4 steps:

(Step 1) Creating an Image Training Set: For this stage, 1000 frontal face images were selected from the open access CelebA face database [30] and prepared for training. The resolution of these HR images was reduced by a factor of 4 to obtain LR images. HR and LR images were used as the training dataset.

(Step 2) Training the ESRGAN Network: For this stage, LR images were used as input to the generator network, and the generated SR images together with the HR images were used as input to the discriminator network. The discriminator network fed its loss back to the generator network, pushing it to produce realistic images. Training was performed for 20 epochs, resulting in a face-focused ESRGAN network model.

(Step 3) Testing the Face-Focused ESRGAN Model: LR images transmitted from the sender's LoRa module to the receiver's LoRa node were fed into the trained network model as input. The ESRGAN network model produced 256 × 256 pixel SR images as output.

(Step 4) PSNR and SSIM Measurements and Results: The images produced during the testing phase were compared with the original images by calculating PSNR and SSIM. The measurement results were evaluated.
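For Step 4, the PSNR part of the comparison follows the standard definition, PSNR = 10 · log10(MAX² / MSE). The sketch below is a plain-Python illustration on flat lists of 8-bit pixel values; it is not the evaluation code used in the study, and SSIM is not covered here.

```python
import math


def psnr(reference, test, max_value=255.0):
    """PSNR in dB between two equally sized images given as flat pixel sequences."""
    if len(reference) != len(test):
        raise ValueError("images must have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


ref = [10, 20, 30, 40]
out = [11, 19, 31, 39]   # toy data: every pixel is off by 1, so MSE = 1
value = psnr(ref, out)
```

Higher values mean the reconstructed image is closer to the reference in a pixel-wise sense, which is why PSNR is paired with structural and perceptual metrics in the evaluation.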

Training Detail

The model trained in our study adopts the ESRGAN architecture. This architecture consists of a 23-block Residual-in-Residual Dense Block (RRDB) generator and a multi-scale discriminator with skip connections. The model was trained using a combination of relativistic adversarial loss, content loss (MSE), and perceptual loss (VGG19-based L1). A total of 1000 face images from the CelebA face image dataset were used as the training dataset. Patches of 64 × 64 pixels were used for the LR image set and patches of 256 × 256 pixels were used for the HR image set. The LR image set was obtained by downscaling the HR image set by 4× using bicubic interpolation. In the experiments, the 64 × 64 → 256 × 256 scaling aimed to maintain training stability and optimize PSNR values. Training was performed for 20 epochs with a batch size of 4 using Adam optimization (lr = 2 × 10⁻⁵) and StepLR scheduling (γ = 0.5 every 10 epochs).
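Two of the training ingredients named above can be sketched in plain Python. The loss below is the relativistic average formulation introduced with ESRGAN; that the study uses exactly this form is an assumption, as are the function names. The schedule mirrors the usual step-decay rule with the stated values (lr = 2 × 10⁻⁵, γ = 0.5, step of 10 epochs).

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def mean(xs):
    return sum(xs) / len(xs)


def relativistic_losses(real_logits, fake_logits, eps=1e-12):
    """Relativistic average GAN losses (discriminator, generator) from raw logits."""
    real_mean, fake_mean = mean(real_logits), mean(fake_logits)
    d_real = [sigmoid(r - fake_mean) for r in real_logits]  # real vs. average fake
    d_fake = [sigmoid(f - real_mean) for f in fake_logits]  # fake vs. average real
    loss_d = (-mean([math.log(p + eps) for p in d_real])
              - mean([math.log(1.0 - p + eps) for p in d_fake]))
    loss_g = (-mean([math.log(1.0 - p + eps) for p in d_real])
              - mean([math.log(p + eps) for p in d_fake]))
    return loss_d, loss_g


def step_lr(epoch, base_lr=2e-5, gamma=0.5, step_size=10):
    """Step decay: multiply the learning rate by gamma every step_size epochs."""
    return base_lr * gamma ** (epoch // step_size)


# When the discriminator cannot tell real from fake, both losses equal 2*ln(2).
loss_d, loss_g = relativistic_losses([0.0, 0.0], [0.0, 0.0])
schedule = [step_lr(e) for e in range(20)]
```

With these values the learning rate stays at 2 × 10⁻⁵ for the first ten epochs and halves to 1 × 10⁻⁵ for the remaining ten.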

3. Results and Discussion

In our study, the images were first downsampled by a factor of 10 to reduce the size of the high-resolution image data for transmission. Then, the images were compressed using JPEG and WebP compression techniques at different quality factors (q = 10, 30, 60). The downsampled and compressed sample images were then converted to string format using base64 encoding for transmission via LoRa. Base64 encoding turns the compressed JPEG and WebP binary data into ASCII text, ensuring compatibility with text-only systems while reversibly preserving the content. The original image used in the study, along with its size and transmission time via LoRa, is listed in Table 3.

As a reference case, the original image was transmitted as a base64 string without any other processing. The base64 string has a data size of 489,033 bytes and is transmitted over LoRa in 2090 packets over 42 min. Both the 1% duty cycle limitation and the long transmission time required for a single face image demonstrate the difficulty of direct image transmission over LoRa. Therefore, reducing image transmission times will facilitate the use of LoRa transmission for environmental monitoring in environments requiring long-distance transmission without internet access.

The transmission times over LoRa of sample images compressed with JPEG and WebP after 10× downsampling are shown in Table 4. According to the results in Table 4, WebP is the more suitable of the two compression techniques. WebP compression exhibits less distortion perceived by the human eye than JPEG while also compressing better. Each sample image compressed with different quality factors was transmitted from the sender LoRa node to the receiver LoRa node over the LoRa network. The transmission time results show that WebP compression provides faster transmission, in line with its better compression performance. While the direct transmission time of the original image was 42 min (2520 s), the image compressed with WebP at q = 10 was transmitted in 4.51 s. Based on these results, a reduction of approximately 99.82% is achieved in transmission time.
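The reduction figure follows directly from the two reported times. The short calculation below reproduces it; the function name is illustrative.

```python
def reduction_percent(before_s: float, after_s: float) -> float:
    """Relative reduction in transmission time, in percent."""
    return (before_s - after_s) / before_s * 100.0


original_s = 42 * 60    # 42 min for the unprocessed image
compressed_s = 4.51     # WebP, q = 10, after 10x downsampling
improvement = reduction_percent(original_s, compressed_s)
print(f"{improvement:.2f}%")  # prints 99.82%
```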

All transmission times and packet counts reported in this study (Table 4 and Table 5) are based not on the original binary data size, but on the Base64-encoded string size, which introduces an additional data size overhead of approximately 33%. For example, a 0.805 KB WebP (q = 10, downsampling = yes) binary file (Table 1) results in a Base64 string of 1101 bytes, and our transmission calculations (4.51 s) are based on this 1101-byte size (Table 4). Although direct binary packaging appears to be more efficient in terms of data size, Base64 was preferred for text-based communication due to its simplicity, compatibility, and robustness against transmission errors.
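The roughly 33% overhead comes from base64 mapping every 3 input bytes to 4 output characters. A minimal check with the standard library is shown below; the 900-byte payload is a hypothetical stand-in, and `base64_length` is an illustrative helper.

```python
import base64
import math


def base64_length(n_bytes: int) -> int:
    """Length of the padded base64 encoding of n_bytes of binary data."""
    return 4 * math.ceil(n_bytes / 3)


raw = bytes(900)                      # hypothetical 900-byte binary payload
encoded = base64.b64encode(raw)
overhead = len(encoded) / len(raw) - 1   # fraction of extra bytes added
```

For a file of about 0.805 KB this rule gives a string of roughly 1100 characters, in line with the 1101-byte size reported above.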

Additionally, satisfactory results were achieved in compliance with the duty cycle limit of only 1% data transmission per hour. Under this constraint, a total of 36 s of data transmission per hour can be achieved. An evaluation using a sample image compressed with WebP (q = 10) shows that a single image can be transmitted in approximately 4.51 s. Based on this transmission time, approximately seven images per hour can be transmitted for images of similar sizes.
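The duty cycle arithmetic can be written out as a short calculation. It treats the measured 4.51 s per image as time on air, which is a simplifying assumption (the measured time may include inter-packet delays), and the function names are illustrative.

```python
import math


def airtime_budget_s(duty_cycle: float, window_s: float = 3600.0) -> float:
    """Allowed transmit time within one window for a given duty cycle."""
    return duty_cycle * window_s


def images_per_window(budget_s: float, per_image_s: float) -> int:
    """Number of complete images that fit into the airtime budget."""
    return math.floor(budget_s / per_image_s)


budget = airtime_budget_s(0.01)                # 1% of one hour
n_images = images_per_window(budget, 4.51)     # WebP, q = 10 sample image
```

An eighth image would need 8 × 4.51 s = 36.08 s, slightly more than the 36 s budget, which is why the count is seven.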

In the next phase of the study, the image received by the receiver node is reproduced with the trained ESRGAN model. Additionally, low-resolution face images are reproduced using the Real-ESRGAN model to demonstrate the comparative performance of super-resolution GAN models. Figure 9 shows the results of the ESRGAN and Real-ESRGAN models applied to low-resolution images compressed with JPEG. Figure 10 shows the results of the ESRGAN and Real-ESRGAN models applied to sample images compressed with WebP. The generated images are evaluated with the original images using PSNR and SSIM metrics.

To provide a comparison for the images produced by our own trained ESRGAN model, SR images were also produced using the pretrained Real-ESRGAN model [40]. Figure 9 shows the SR image outputs generated by the ESRGAN model and the pretrained Real-ESRGAN model. The first image in each row is the image that was first downsampled (10×), compressed with JPEG at q = 10, q = 30, and q = 60, respectively, and transmitted to the receiver. ESRGAN and Real-ESRGAN were applied to these three images with different quality factors. The resulting images were compared with the original images in terms of PSNR and SSIM metrics. Since LoRa transmission requires the lowest data size, the q = 10 case is the most relevant: for the image created by ESRGAN from the JPEG-compressed LR image at q = 10, the PSNR value is 23.03 dB and the SSIM value is 0.6543. For Real-ESRGAN, the PSNR value is 23.12 dB and the SSIM value is 0.6579.

The SR images generated by ESRGAN and Real-ESRGAN from the WebP-compressed LR images, shown in Figure 10, were compared in terms of PSNR and SSIM metrics. The PSNR and SSIM values of the SR images generated by ESRGAN and Real-ESRGAN are very close. For an image compressed with WebP at a quality factor of q = 10, the PSNR value of the image generated by ESRGAN is 23.47 dB and the SSIM value is 0.6650. For Real-ESRGAN, the PSNR is 23.35 dB and the SSIM is 0.6676. These results demonstrate that the ESRGAN network we trained under limited hardware conditions can achieve results similar to, and sometimes superior to, the Real-ESRGAN network.

While the results from the Real-ESRGAN model are in most cases slightly higher than those from the ESRGAN model, the PSNR and SSIM results for both models are close in value. When evaluated by human perception, the images produced by Real-ESRGAN appear more artificial, whereas the ESRGAN model produces more realistic images. However, due to issues such as insufficient training time caused by the computer infrastructure used for training the ESRGAN model (graphics card, CPU, and RAM limitations) and disconnections during long epochs, the number of epochs was kept small. Additionally, due to the crashes that occurred during the lengthy training process, the dataset's image count (1000 images) was kept low. Nevertheless, compared to Real-ESRGAN, the face-focused ESRGAN model trained in our study also yielded good results. Specifically, as seen in Figure 10, the PSNR value of the image generated with ESRGAN is higher than that generated with Real-ESRGAN for the WebP-compressed image at a quality factor of 10.

According to the test results, the transmission times of images compressed with JPEG and WebP compression formats via LoRa are shown graphically in Figure 11. The graph shows that transmission times for images compressed in WebP format are shorter than for images compressed in JPEG format.

Figure 12 shows the PSNR changes that occurred during the 20 epochs of ESRGAN model training. The PSNR values increase linearly at each epoch step. This shows that the ESRGAN model improved linearly throughout training.

When compression methods are evaluated according to experimental results, WebP compression provides a better compression ratio and lower distortion compared to JPEG compression. Accordingly, LoRa transmission times for WebP-compressed images have a shorter duration than those for JPEG compression. Additionally, the SR image results obtained from the WebP compressed image are deemed superior to those from the JPEG-compressed image. The WebP compression method, which offers advantages in every aspect, is considered the most suitable compression method in this research for image transmission using LoRa.

In Figure 13, image quality is evaluated using the Identity Preservation (IP) and Learned Perceptual Image Patch Similarity (LPIPS) metrics, in addition to PSNR and SSIM, which measure pixel and structural similarity. Of these metrics, LPIPS measures the visual closeness of the generated images to the original images and provides a quality assessment closer to human perception. In this study, an AlexNet-based feature extraction model is used to measure perceptual similarity. Unlike the other metrics, lower LPIPS values indicate better quality. IP measures how well a reconstructed face image preserves the identity of the original person. For this metric, face embeddings are extracted using the ArcFace-based InsightFace model and compared using cosine similarity. Figure 13 shows the evaluation performed with three face images selected from the Human Faces dataset [29], which is different from the dataset used in GAN training. Within the scope of this evaluation, the PSNR, SSIM, IP, and LPIPS metrics were calculated for WebP compression, JPEG compression, the face-focused ESRGAN model, and the Real-ESRGAN model. The three selected face images were compressed at a quality level of 10 using WebP, which was found to compress more efficiently. Each image was then reconstructed using both the face-focused ESRGAN and Real-ESRGAN models.
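The identity-preservation score described above reduces to a cosine similarity between two face-embedding vectors. The following is a minimal sketch in plain Python, assuming the embeddings have already been extracted by a face-recognition model. The function name `cosine_similarity` and the short example vectors are illustrative assumptions and are not taken from the study.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


# Hypothetical 4-dimensional embeddings (real face embeddings are much longer).
original = [0.2, 0.1, 0.7, 0.4]
reconstructed = [0.25, 0.05, 0.65, 0.45]
ip_score = cosine_similarity(original, reconstructed)
```

A score near 1 means the two embeddings point in almost the same direction, which is read here as the identity being well preserved.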

The metric results in Figure 13 for WebP compression are consistent with its high compression and low distortion. WebP provides good results in terms of PSNR and identity preservation.

JPEG is a lossy compression method that introduces high image distortion. Compared to the SR-based methods, it achieves moderate PSNR, SSIM, and identity preservation. It performs worst on LPIPS, the metric that approximates human perception.

Real-ESRGAN achieves the highest value in SSIM and the lowest in LPIPS, demonstrating its superiority in both structural and perceptual quality.

The ESRGAN model, despite being trained under limited hardware (CPU) conditions, performed better than classical compression methods in terms of both structural similarity (SSIM) and perceptual quality (LPIPS), achieving results comparable to Real-ESRGAN.

Table 5 shows the data sizes, packet counts, and LoRa transmission times for the images in Figure 13, both in their original form and after 4× downsampling followed by WebP compression. Without any processing, the 18,343 bytes of the first image are transmitted in approximately one and a half minutes (94.99 s). This time exceeds the duty cycle limit and is relatively long for data transmission. When this image is reduced to 853 bytes by downsampling and WebP compression, the transmission time falls to 4.81 s, a reduction of approximately 94.94%. The second image showed a 98.77% reduction in transmission time, and the third image a 99.08% reduction.
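The reductions quoted above can be rechecked directly from the transmission times in Table 5. The short sketch below does this. The helper name `reduction_percent` is an illustrative assumption, and the times are copied from the table.

```python
def reduction_percent(before_s, after_s):
    """Relative reduction in transmission time, in percent."""
    return (before_s - after_s) / before_s * 100.0


# (original time in s, time after 4x downsampling + WebP in s), from Table 5
table5_times = {1: (94.99, 4.81), 2: (491.8, 6.02), 3: (1042.53, 9.62)}

reductions = {
    image: round(reduction_percent(before, after), 2)
    for image, (before, after) in table5_times.items()
}
```

Recomputed this way, the three reductions come out at roughly 94.9%, 98.8%, and 99.1%.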

For the overall evaluation of the study, a small-scale performance evaluation was conducted using 15 different face images from the Human Faces dataset [29], including the three images in Figure 13. Images obtained using ESRGAN, Real-ESRGAN, JPEG compression, and WebP compression were evaluated using the PSNR, SSIM, LPIPS, and IP metrics.

In addition, a Mean Opinion Score (MOS) evaluation, which measures human observers' perception of quality, was conducted. MOS scoring was performed using Google Forms. For each of the fifteen images, the versions obtained using ESRGAN, Real-ESRGAN, JPEG compression, and WebP compression were presented alongside the original image, and participants were asked to rate each version against its original. The MOS score for each technique was determined from the ratings of 63 participants. The same 15 images were used for the other metrics. The averages of the evaluation results were calculated, and all results are presented in Table 6.

Higher PSNR, SSIM, IP, and MOS values in the evaluation indicate better preservation of the structural integrity, identification information, and perceptual quality of the images. Conversely, lower LPIPS values indicate that the generated images are perceptually closer to the original images. Therefore, an ideal model should have high PSNR, SSIM, IP, and MOS values and low LPIPS scores.
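To make the PSNR part of this comparison concrete, the sketch below computes PSNR from the mean squared error between two images using the standard definition, PSNR = 10 log10(MAX² / MSE). It is a minimal plain-Python illustration, not the evaluation code used in the study. The pixel values are hypothetical, and images are assumed to be flattened 8-bit sequences.

```python
import math


def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(test) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical flattened 8-bit grayscale patches (illustrative values only).
original = [52, 55, 61, 66, 70, 61, 64, 73]
restored = [54, 55, 60, 67, 69, 62, 65, 70]
score_db = psnr(original, restored)
```

Because the error term sits in the denominator, a smaller pixel-wise error gives a larger PSNR, which is why higher values are read as better.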

According to the results, Real-ESRGAN achieved the highest average PSNR (25.29 dB), followed by WebP (25.10 dB). This result is expected for WebP, since it is a compression method that causes little distortion. ESRGAN (24.84 dB) produced a competitive result despite being trained under CPU limitations. JPEG had the lowest average PSNR (24.37 dB), indicating the most noise and information loss.

SSIM values are highest in Real-ESRGAN at 0.66. The ESRGAN model yielded a very close result at 0.63. This demonstrates that SR-based methods successfully generalize in preserving structural similarity compared to traditional compression methods.

The LPIPS value measures perceptual quality, with lower values indicating higher visual similarity. Real-ESRGAN (0.24) provided the highest perceptual quality with the lowest average LPIPS value. WebP (0.32) came in second, while ESRGAN (0.36) performed competitively despite being trained under CPU constraints.

When the IP metric is evaluated, all methods provide a moderate level of identity preservation. The WebP method achieved the highest identity preservation value (0.80) and largely preserved identity integrity after compression. SR-based methods, however, need further improvement in identity preservation compared to traditional methods.

In subjective evaluation, Real-ESRGAN has the highest average MOS score (3.85). ESRGAN ranked second with a score of 3.46 and was observed to produce satisfactory results in terms of visual quality according to human perception despite CPU limitations. Among traditional methods, WebP (3.26), which causes less distortion, is better than the JPEG (2.78) method.
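The per-metric comparisons in this discussion can be reproduced from the averages in Table 6. The sketch below ranks the four methods for each metric, treating LPIPS as the only lower-is-better metric. The values are copied from Table 6, while the names `table6`, `LOWER_IS_BETTER`, and `rank_methods` are illustrative assumptions.

```python
# Average scores from Table 6: method -> metric -> value
table6 = {
    "WebP":        {"PSNR": 25.10, "SSIM": 0.61, "LPIPS": 0.32, "IP": 0.80, "MOS": 3.26},
    "JPEG":        {"PSNR": 24.37, "SSIM": 0.59, "LPIPS": 0.42, "IP": 0.76, "MOS": 2.78},
    "ESRGAN":      {"PSNR": 24.84, "SSIM": 0.63, "LPIPS": 0.36, "IP": 0.69, "MOS": 3.46},
    "Real-ESRGAN": {"PSNR": 25.29, "SSIM": 0.66, "LPIPS": 0.24, "IP": 0.76, "MOS": 3.85},
}

# LPIPS is the only metric here where a lower value is better.
LOWER_IS_BETTER = {"LPIPS"}


def rank_methods(scores, metric):
    """Return method names ordered from best to worst for one metric."""
    reverse = metric not in LOWER_IS_BETTER
    return sorted(scores, key=lambda m: scores[m][metric], reverse=reverse)


rankings = {
    metric: rank_methods(table6, metric)
    for metric in ["PSNR", "SSIM", "LPIPS", "IP", "MOS"]
}
```

Note that JPEG and Real-ESRGAN tie on IP (0.76), so their relative order in that ranking carries no meaning.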

Latency and Embedded System Feasibility

A practical challenge for this system is the total latency on the receiver node, which is a constrained embedded device (Raspberry Pi 4). We analyzed the total latency as two components: (1) Transmission Latency and (2) Inference Latency.

The transmission latency for the WebP (q = 10) image was 4.51 s. We then measured the Inference Latency required for our trained ESRGAN model to perform SR on the received image. On the Raspberry Pi 4, this process took an average of 10 s.

Therefore, the total end-to-end latency (Transmission + Inference) is approximately 14.51 s. While this is unsuitable for real-time video, it is highly feasible for the target applications of non-real-time remote security alerts or identity verification checks.

4. Conclusions

This study designed and demonstrated an end-to-end system for transmitting facial images over LoRa technology in environments without internet access. LoRa transmission offers a simple, functional, low-energy, and cost-effective solution for the intermittent transmission of data collected by sensors. However, it is poorly suited to visual monitoring, because images are large relative to the amount of data it can carry. We combined aggressive downsampling with lossy compression (JPEG and WebP) to address LoRa's challenges, such as its limited data size and duty cycle limitations. WebP compression (q = 10) was found to be superior, reducing the size of a 358 KB sample image to 0.805 KB (1.1 KB after Base64 encoding). This cut the transmission time from 42 min to 4.51 s, enabling the system to operate effectively within the 1% duty cycle limitation.
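The effect of the 1% duty cycle on these two transmission times can be illustrated with a rough budget calculation. The sketch below rests on two assumptions that are not stated in the text: that the limit is applied over a one-hour window, and that the measured transmission time can be treated as on-air time. All names are illustrative.

```python
def airtime_budget_s(duty_cycle, window_s=3600.0):
    """Maximum on-air time allowed within one observation window."""
    return duty_cycle * window_s


def min_off_time_s(airtime_s, duty_cycle):
    """Silent time needed after a transmission to stay within the duty cycle."""
    return airtime_s * (1.0 / duty_cycle - 1.0)


DUTY_CYCLE = 0.01          # 1% limit mentioned in the text
WEBP_Q10_TIME_S = 4.51     # measured transmission time, treated as airtime (assumption)
ORIGINAL_TIME_S = 42 * 60  # 42 min for the unprocessed image

budget = airtime_budget_s(DUTY_CYCLE)  # seconds of airtime per one-hour window
images_per_hour = int(budget // WEBP_Q10_TIME_S)
wait_after_image = min_off_time_s(WEBP_Q10_TIME_S, DUTY_CYCLE)
original_fits = ORIGINAL_TIME_S <= budget
```

Under these assumptions the budget is 36 s per hour, so the compressed image fits several times over while the 42 min transmission of the original image does not fit at all.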

At the receiver, a face-focused ESRGAN model was trained to restore low-resolution images. This model was evaluated against Real-ESRGAN and traditional compression methods (JPEG and WebP) using the PSNR, SSIM, LPIPS, IP, and MOS metrics. Experimental results demonstrate that Real-ESRGAN achieves the highest overall performance across both objective and subjective metrics. Despite CPU-based training and limited data, the ESRGAN model achieved higher structural similarity (SSIM) and subjective quality (MOS) than the traditional compression methods WebP and JPEG, and a better LPIPS score than JPEG. These results demonstrate the potential of the ESRGAN architecture and suggest that its performance can be improved further under better hardware conditions. The study confirms that image transmission over LoRa is feasible when combined with modern compression and super-resolution techniques.

4.1. Limitations

In our study, the ESRGAN model was trained on a Windows Server 2016 virtual machine equipped with a 32-core CPU and 64 GB of RAM, using 1000 face images for 20 epochs. This constitutes the most significant limitation of our study. While the results demonstrate the applicability of the method, due to the limited hardware and data coverage, the trained model serves as a proof of concept and does not reflect the model’s full potential performance.

Given these limitations, the primary objective of our study is to demonstrate the applicability and conceptual validity of the method.

4.2. Future Work

The first planned future work is to develop a more robust, face-focused ESRGAN model that utilizes sufficient GPU resources for training on a much larger dataset and more epochs. Second, we will train new SR models specifically for environmental datasets (e.g., wildlife, field, or agricultural monitoring) to address the scope limitation. Third, we will investigate using direct binary packing to eliminate the 33% overhead caused by Base64 encoding. Finally, the robustness of the proposed system under more complex channel conditions will be investigated. In particular, the impact of channel noise, packet loss, and partial image occlusion on system performance will be analyzed.
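The 33% Base64 overhead mentioned above follows from the encoding itself: every 3 input bytes become 4 output characters. The sketch below shows this with Python's standard `base64` module. The payload is stand-in binary data rather than an actual compressed image, and the helper `base64_length` is an illustrative assumption.

```python
import base64
import math


def base64_length(n_bytes):
    """Length of standard padded Base64 output for n_bytes of input."""
    return 4 * math.ceil(n_bytes / 3)


payload = bytes(range(256)) * 3  # 768 bytes of stand-in binary data (not a real image)
encoded = base64.b64encode(payload)
overhead = (len(encoded) - len(payload)) / len(payload)
```

Sending the raw bytes instead of the Base64 text would avoid this expansion, which is the motivation for the direct binary packing proposed here.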

Author Contributions

Conceptualization, F.Y.D. and B.B.; methodology, F.Y.D.; software, F.Y.D.; validation, F.Y.D. and B.B.; formal analysis, B.B.; investigation, F.Y.D. and B.B.; resources, F.Y.D. and B.B.; data curation, F.Y.D.; writing—original draft preparation, B.B.; writing—review and editing, F.Y.D.; visualization, F.Y.D.; supervision, B.B.; project administration, B.B.; funding acquisition, B.B. and F.Y.D. All authors have read and agreed to the published version of the manuscript.

Funding

This study was supported by Erciyes University Scientific Research Projects Coordination Unit within the scope of project number FDK-2023-12503.

Data Availability Statement

The original contributions presented in the study are included in the article. Further inquiries can be directed to the corresponding author.

Acknowledgments

The authors thank the Erciyes University Dean of Research for providing the necessary infrastructure and laboratory facilities at the ArGePark research building.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Guerra, K.; Casavilca, J.; Huamán, S.; López, L.; Sanchez, A.; Kemper, G. A low-rate encoder for image transmission using LoRa communication modules. Int. J. Inf. Technol. 2023, 15, 1069–1079.
2. Kirichek, R.; Pham, V.-D.; Kolechkin, A.; Al-Bahri, M.; Paramonov, A. Transfer of Multimedia Data via LoRa. In Internet of Things, Smart Spaces, and Next Generation Networks and Systems; Springer: Cham, Switzerland, 2017; Volume 10531, pp. 708–720.
3. Wei, C.-C.; Su, P.-Y.; Chen, S.-T. Comparison of the LoRa Image Transmission Efficiency Based on Different Encoding Methods. Int. J. Inf. Electron. Eng. 2020, 10, 1–4.
4. Ji, M.; Yoon, J.; Choo, J.; Jang, M.; Smith, A. LoRa-based Visual Monitoring Scheme for Agriculture IoT. In Proceedings of the IEEE Sensors Applications Symposium (SAS), Sophia Antipolis, France, 11–13 March 2019; pp. 1–6.
5. Chen, T.; Eager, D.; Makaroff, D. Efficient Image Transmission Using LoRa Technology In Agricultural Monitoring IoT Systems. In Proceedings of the International Conference on Internet of Things (iThings) and IEEE Green Computing and Communications (GreenCom) and IEEE Cyber, Physical and Social Computing (CPSCom) and IEEE Smart Data (SmartData), Atlanta, GA, USA, 14–17 July 2019; pp. 937–944.
6. Haron, M.H.; Isa, M.N.; Ahmad, M.I.; Ismail, R.C.; Ahmad, N. Image Data Compression Using Discrete Cosine Transform Technique for Wireless Transmission. Int. J. Nanoelectron. Mater. 2021, 14, 289–297.
7. Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial networks. In Proceedings of the Advances Neural Information Processing Systems Conference, Montreal, QC, Canada, 8–13 December 2014; pp. 2672–2680.
8. Gm, H.; Gourisaria, M.K.; Pandey, M.; Rautaray, S.S. A comprehensive survey and analysis of generative models in machine learning. Comput. Sci. Rev. 2020, 38, 100285.
9. Radford, A.; Metz, L.; Chintala, S. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. arXiv 2015, arXiv:1511.06434.
10. Ledig, C.; Theis, L.; Huszar, F.; Caballero, J.; Cunningham, A.; Acosta, A.; Aitken, A.; Tejani, A.; Totz, J.; Wang, Z.; et al. Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 105–114.
11. Wang, X.; Yu, K.; Wu, S.; Gu, J.; Liu, Y.; Dong, C.; Change Loy, C.; Qiao, Y.; Tang, X. Esrgan: Enhanced super-resolution generative adversarial networks. In Proceedings of the European conference on computer vision (ECCV) workshops, Munich, Germany, 8–14 September 2018; pp. 63–79.
12. Harris, J.L. Diffraction and resolving power. J. Opt. Soc. Am. 1964, 54, 931–936.
13. Tsai, R.Y.; Huang, T.S. Multiframe image restoration and registration. Adv. Comput. Vision. Image Process. 1984, 1, 317–339.
14. Yue, L.; Shen, H.; Li, J.; Yuan, Q.; Zhang, H.; Zhang, L. Image super-resolution: The techniques, applications, and future. Signal Process. 2016, 128, 389–408.
15. Kim, J.; Lee, J.K.; Lee, K.M. Accurate image super-resolution using very deep convolutional networks. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 1646–1654.
16. Lai, W.-S.; Huang, J.-B.; Ahuja, N.; Yang, M.-H. Deep laplacian pyramid networks for fast and accurate super-resolution. In Proceedings of the IEEE conference on computer vision and pattern recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 624–632.
17. Zhang, Y.; Li, K.; Li, K.; Wang, L.; Zhong, B.; Fu, Y. Image super-resolution using very deep residual channel attention networks. In Proceedings of the European conference on computer vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 286–301.
18. Zhuang, Y.; Zhang, J.; Wu, F. Hallucinating faces: LPH super-resolution and neighbor reconstruction for residue compensation. Pattern Recognit. 2007, 40, 3178–3194.
19. Wang, X.; Tang, X. Hallucinating face by eigentransformation. IEEE Trans. Syst. Man Cybern. Part C (Appl. Rev.) 2005, 35, 425–434.
20. Wang, X.; Xie, L.; Dong, C.; Shan, Y. Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data. In Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), Montreal, BC, Canada, 11–17 October 2021; pp. 1905–1914.
21. Sun, Z.; Ng, C.K.C. Finetuned Super-Resolution Generative Adversarial Network (Artificial Intelligence) Model for Calcium Deblooming in Coronary Computed Tomography Angiography. J. Pers. Med. 2022, 12, 1354.
22. Nandal, P.; Pahal, S.; Khanna, A.; Pinheiro, P.R. Super-Resolution of Medical Images Using Real ESRGAN. IEEE Access 2024, 12, 176155–176170.
23. Agarwal, V.; Lohani, M.C.; Singh Bist, A.; Rahardja, L.; Hardini, M.; Mustika, G. Deep CNN–Real ESRGAN: An Innovative Framework for Lung Disease Prediction. In Proceedings of the IEEE Creative Communication and Innovative Technology (ICCIT), Tangerang, Indonesia, 22–23 November 2022; pp. 1–6.
24. Li, R.; Zhou, W. Image Super-Resolution Reconstruction of Landslide Based on Real-ESRGAN. In Proceedings of the 2024 8th International Conference on Control Engineering and Artificial Intelligence (CCEAI ′24), New York, NY, USA, 26–28 January 2024; pp. 203–208.
25. Aghelan, A.; Rouhani, M. Underwater Image Super-Resolution using Generative Adversarial Network-based Model. In Proceedings of the 13th International Conference on Computer and Knowledge Engineering (ICCKE), Mashhad, Iran, 1–2 November 2023; pp. 480–484.
26. Andriadi, K.; Zarlis, M.; Heryadi, Y.; Chowanda, A. Evaluation of REAL-ESRGAN Using Different Types of Image Degradation. In Proceedings of the International Conference on ICT for Smart Society (ICISS), Bandung, Indonesia, 4–5 September 2024; pp. 1–6.
27. He, Q.; Yuan, H.; Feng, D.; Che, B.; Chen, Z.; Xia, X.-G. Robust Semantic Transmission of Images with Generative Adversarial Networks. In Proceedings of the GLOBECOM - IEEE Global Communications Conference, Rio de Janeiro, Brazil, 4–8 December 2022; pp. 3953–3958.
28. Erdemir, E.; Tung, T.-Y.; Dragotti, P.L.; Gündüz, D. Generative Joint Source-Channel Coding for Semantic Image Transmission. IEEE J. Sel. Areas Commun. 2023, 41, 2645–2657.
29. Gupta, A.; Kaggle. Human Faces Dataset. Available online: https://www.kaggle.com/datasets/ashwingupta3012/human-faces (accessed on 10 September 2025).
30. Li, J.; Kaggle. CelebFaces Attributes (CelebA) Dataset. Available online: https://www.kaggle.com/datasets/jessicali9530/celeba-dataset (accessed on 20 June 2025).
31. García, L.; Cancimance, C.; Asorey-Cacheda, R.; Zúñiga-Cañón, C.-L.; Garcia-Sanchez, A.-J.; Garcia-Haro, J. Compliant and Seamless Hybrid (Star and Mesh) Network Topology Coexistence for LoRaWAN: A Proof of Concept. Appl. Sci. 2025, 15, 3487.
32. Alhomyani, H.; Fadel, M.; Dimitriou, N.; Bakhsh, H.; Aldabbagh, G. Modeling the Performance of a Multi-Hop LoRaWAN Linear Sensor Network for Energy-Efficient Pipeline Monitoring Systems. Appl. Sci. 2024, 14, 9391.
33. Ariyoshi, R.; Li, A.; Hasegawa, M.; Ohtsuki, T. Energy-Efficient Resource Allocation Scheme Based on Reinforcement Learning in Distributed LoRa Networks. Sensors 2025, 25, 4996.
34. Abdallah, B.; Khriji, S.; Chéour, R.; Lahoud, C.; Moessner, K.; Kanoun, O. Improving the Reliability of Long-Range Communication against Interference for Non-Line-of-Sight Conditions in Industrial Internet of Things Applications. Appl. Sci. 2024, 14, 868.
35. Zhalmagambetova, U.; Neftissov, A.; Biloshchytskyi, A.; Kazambayev, I.; Shimpf, A.; Kazhibekov, M.; Snopkov, D. A Secure Telemetry Transmission Architecture Independent of GSM: An Experimental LoRa-Based System on Raspberry Pi for IIoT Monitoring Tasks. Appl. Sci. 2025, 15, 9539.
36. Alhattab, N.; Bouabdallah, F.; Khairullah, E.F.; Aseeri, A. Reinforcement Learning-Based Time-Slotted Protocol: A Reinforcement Learning Approach for Optimizing Long-Range Network Scalability. Sensors 2025, 25, 2420.
37. ETSI. ETSI TR 103 526 V.1.1.1 Technical Report. Available online: https://www.etsi.org/deliver/etsi_tr/103500_103599/103526/01.01.01_60/tr_103526v010101p.pdf (accessed on 10 June 2025).
38. The Things Network. Duty Cycle. Available online: https://www.thethingsnetwork.org/docs/lorawan/duty-cycle/ (accessed on 15 June 2025).
39. Romero Vázquez, J.L.; García-Barrientos, A.; Del-Puerto-Flores, J.A.; Castillo Soria, F.R.; Ibarra-Hernández, R.F.; Pineda Rico, U.; Zambrano-Serrano, E. Verification of a Probabilistic Model and Optimization in Long-Range Networks. Appl. Sci. 2025, 15, 1873.
40. Wang, X. Real-ESRGAN. GitHub Repository. 2021. Available online: https://github.com/xinntao/Real-ESRGAN (accessed on 20 June 2025).

Figure 1. Basic model of a generative adversarial network.

Figure 2. Workflow of the study.

Figure 3. The methodology of the study.

Figure 4. JPEG Compression Results According to Quality Levels. (Image sizes: original → 358 KB, q = 60 → 164 KB, q = 30 → 101 KB, q = 10 → 51.9 KB).

Figure 5. Downsampling followed by JPEG compression. (Image sizes: original → 358 KB, q = 60 → 3.04 KB, q = 30 → 2.16 KB, q = 10 → 1.43 KB).

Figure 6. WebP Compression Results According to Quality Levels. (Image sizes: original → 358 KB, q = 60 → 95.9 KB, q = 30 → 58.7 KB, q = 10 → 33.7 KB).

Figure 7. Downsampling followed by WebP compression. (Image sizes: original → 358 KB, q = 60 → 1.75 KB, q = 30 → 1.20 KB, q = 10 → 0.805 KB).

Figure 8. The architecture of the ESRGAN model.

Figure 9. Application of ESRGAN and Real-ESRGAN models to JPEG compressed images.

Figure 10. Application of ESRGAN and Real-ESRGAN models to WebP compressed images.

Figure 11. Transmission times of compressed images via LoRa.

Figure 12. PSNR results for each epoch during ESRGAN model training.

Figure 13. Application of ESRGAN and Real-ESRGAN models to different face images.

Table 1. Comparative Compression Results of the 358 KB Original Image.

Downsampling	Compression Quality	JPEG	WebP
No	q = 10	51.9 KB	33.7 KB
No	q = 30	101 KB	58.7 KB
No	q = 60	164 KB	95.9 KB
Yes (10×)	q = 10	1.43 KB	0.805 KB
Yes (10×)	q = 30	2.16 KB	1.20 KB
Yes (10×)	q = 60	3.04 KB	1.75 KB

Table 2. LoRa Module Specification.

Parameter	Specification
Model	E22-900T22S
Frequency Range	868–915 MHz
Data Packet Size	240 bytes
Air Data Rate	0.3 kbps (min), 2.4 kbps (typical), 62.5 kbps (max)
Max Transmission distance	7000 m (Test condition: clear and open area, antenna gain: 5 dBi, antenna height: 2.5 m, air data rate: 2.4 kbps)
Antenna gain	5 dBi

Table 3. Data Size and Transmission Time for the Original Image.

Original Image	Base64 Data Size	Number of Packets	Transmission Time
	489,033 bytes	2090	42 min

Table 4. LoRa Transmission Times Using JPEG and WebP Compression.

	Downsampling (10×) + JPEG			Downsampling (10×) + WebP
Quality level	q = 10	q = 30	q = 60	q = 10	q = 30	q = 60
Base64 Data Size (bytes)	1961	2953	4161	1101	1645	2401
Number of Packets	9	13	18	5	8	11
Transmission Time (s)	8.12	11.72	16.24	4.51	7.22	9.92

Table 5. LoRa Transmission Times using WebP Compression for 3 different images.

	Original Image			Downsampling (4×) + Image Compressed with WebP
Image	Base64 Data Size (bytes)	Transmission time	Number of packets	Base64 Data Size (bytes)	Transmission time	Number of packets
1	18,343	94.99 s	79	853	4.81 s	4
2	95,241	491.8 s	409	1053	6.02 s	5
3	201,993	1042.53 s	867	1777	9.62 s	8

Table 6. Comparison of image generation methods using multiple evaluation metrics.

Method	PSNR (dB)	SSIM	LPIPS	IP	MOS
WebP	25.10	0.61	0.32	0.80	3.26
JPEG	24.37	0.59	0.42	0.76	2.78
ESRGAN	24.84	0.63	0.36	0.69	3.46
Real-ESRGAN	25.29	0.66	0.24	0.76	3.85

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

