Search Results (56)

Search Parameters:
Keywords = ESRGAN

17 pages, 6884 KB  
Article
A Comparative Evaluation of Super-Resolution Methods for Spectral Images Using Pretrained RGB Models
by Navid Shokoohi, Abdelhamid N. Fsian, Jean-Baptiste Thomas and Pierre Gouton
Sensors 2026, 26(2), 683; https://doi.org/10.3390/s26020683 - 20 Jan 2026
Abstract
The spatial resolution of spectral imaging systems is fundamentally constrained by hardware trade-offs, and the availability of large-scale annotated spectral datasets remains limited. This study presents a comprehensive evaluation of super-resolution (SR) methods across interpolation-based, CNN-based, GAN-based, and diffusion-based approaches. Using a synthetic 30-band spectral representation reconstructed from RGB with the MST++ model as a proxy ground truth, we arrange non-adjacent triplets as three-channel PNG inputs to ensure compatibility with existing SR architectures. A unified pipeline enables reproducible evaluation at ×2, ×4, and ×8 scales on 50 unseen images, with performance assessed using PSNR, SSIM, and SAM. Results confirm that bicubic interpolation remains a spectrally reliable baseline; shallow CNNs (SRCNN, FSRCNN) generalize well without fine-tuning; and ESRGAN improves spatial detail at the expense of spectral accuracy. Diffusion models (SR3, ResShift, SinSR), evaluated in a zero-shot setting without spectral-domain adaptation, exhibit unstable performance and require spectrum-aware training to preserve spectral structure effectively. The findings underscore a persistent trade-off between perceptual sharpness and spectral fidelity, highlighting the importance of domain-aware objectives when applying generative SR models to spectral data. This work provides reproducible baselines and a flexible evaluation framework to support future research in spectral image restoration. Full article
(This article belongs to the Special Issue Feature Papers in Sensing and Imaging 2025&2026)
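The evaluation above rests on three standard metrics. As a minimal sketch of how they can be computed for spectral cubes, assuming (H, W, B) float arrays scaled to [0, 1] (the shapes and value ranges are assumptions, not the paper's code):

```python
# Minimal sketch of the three metrics used in the evaluation: PSNR, SSIM,
# and the Spectral Angle Mapper (SAM). Array shapes/ranges are assumed.
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def spectral_angle_mapper(ref: np.ndarray, est: np.ndarray) -> float:
    """Mean spectral angle (radians) between two (H, W, B) spectral cubes."""
    ref_flat = ref.reshape(-1, ref.shape[-1]).astype(np.float64)
    est_flat = est.reshape(-1, est.shape[-1]).astype(np.float64)
    dot = np.sum(ref_flat * est_flat, axis=1)
    denom = np.linalg.norm(ref_flat, axis=1) * np.linalg.norm(est_flat, axis=1)
    cos = np.clip(dot / np.maximum(denom, 1e-12), -1.0, 1.0)
    return float(np.mean(np.arccos(cos)))

def evaluate_pair(hr: np.ndarray, sr: np.ndarray) -> dict:
    """hr, sr: (H, W, B) float arrays in [0, 1]."""
    return {
        "psnr": peak_signal_noise_ratio(hr, sr, data_range=1.0),
        "ssim": structural_similarity(hr, sr, channel_axis=-1, data_range=1.0),
        "sam": spectral_angle_mapper(hr, sr),
    }
```

Because SAM measures the angle between per-pixel spectra, it is insensitive to uniform intensity scaling, which helps explain how a model can gain spatial sharpness (PSNR/SSIM) while losing spectral fidelity (SAM).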

22 pages, 2302 KB  
Article
MAF-GAN: A Multi-Attention Fusion Generative Adversarial Network for Remote Sensing Image Super-Resolution
by Zhaohe Wang, Hai Tan, Zhongwu Wang, Jinlong Ci and Haoran Zhai
Remote Sens. 2025, 17(24), 3959; https://doi.org/10.3390/rs17243959 - 7 Dec 2025
Viewed by 450
Abstract
Existing Generative Adversarial Networks (GANs) frequently yield remote sensing images with blurred fine details, distorted textures, and compromised spatial structures when applied to super-resolution (SR) tasks. To address these limitations, this study proposes a Multi-Attention Fusion Generative Adversarial Network (MAF-GAN). The generator of MAF-GAN is built on a U-Net backbone that incorporates Oriented Convolutions (OrientedConv) to enhance the extraction of directional features and textures, while a novel co-calibration mechanism (incorporating channel, spatial, gating, and spectral attention) is embedded in the encoding path and skip connections, supplemented by an adaptive weighting strategy for effective multi-scale feature fusion. A composite loss function is further designed that integrates adversarial loss, perceptual loss, hybrid pixel loss, total variation loss, and feature consistency loss to optimize model performance. Extensive experiments on the GF7-SR4×-MSD dataset demonstrate that MAF-GAN achieves state-of-the-art performance, delivering a Peak Signal-to-Noise Ratio (PSNR) of 27.14 dB, a Structural Similarity Index (SSIM) of 0.7206, a Learned Perceptual Image Patch Similarity (LPIPS) of 0.1017, and a Spectral Angle Mapper (SAM) of 1.0871. It significantly outperforms mainstream models including SRGAN, ESRGAN, SwinIR, HAT, and ESatSR, and exceeds traditional interpolation methods (e.g., bicubic) by a substantial margin, while maintaining an excellent balance between reconstruction quality and inference efficiency. Ablation studies validate the individual contribution of each proposed component. The method generates super-resolution remote sensing images with more natural visual perception, clearer spatial structures, and superior spectral fidelity, offering a reliable technical solution for high-precision remote sensing applications. Full article
(This article belongs to the Section Environmental Remote Sensing)
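As an illustration of how the five loss terms named in the abstract can be combined, here is a hedged PyTorch sketch; the weights and the feature extractors feeding the `vgg_*` and `enc_*` inputs are placeholders, not MAF-GAN's actual choices:

```python
# Illustrative composition of the five loss terms (adversarial, perceptual,
# hybrid pixel, total variation, feature consistency). Weights are assumed.
import torch.nn.functional as F

def total_variation(x):
    """Mean absolute difference between neighbouring pixels, x: (N, C, H, W)."""
    return ((x[:, :, 1:, :] - x[:, :, :-1, :]).abs().mean()
            + (x[:, :, :, 1:] - x[:, :, :, :-1]).abs().mean())

def composite_loss(sr, hr, fake_logits, vgg_sr, vgg_hr, enc_sr, enc_hr,
                   w_adv=5e-3, w_perc=1.0, w_pix=1e-2, w_tv=1e-6, w_feat=0.1):
    adv = F.softplus(-fake_logits).mean()                      # non-saturating adversarial loss
    perc = F.l1_loss(vgg_sr, vgg_hr)                           # perceptual (VGG-feature) loss
    pix = 0.5 * F.l1_loss(sr, hr) + 0.5 * F.mse_loss(sr, hr)   # hybrid pixel loss
    tv = total_variation(sr)                                   # total variation loss
    feat = F.mse_loss(enc_sr, enc_hr)                          # feature consistency loss
    return w_adv * adv + w_perc * perc + w_pix * pix + w_tv * tv + w_feat * feat
```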

21 pages, 5246 KB  
Article
Improving Face Image Transmission with LoRa Using a Generative Adversarial Network
by Bilal Babayiğit and Fatma Yarlı Doğan
Appl. Sci. 2025, 15(21), 11767; https://doi.org/10.3390/app152111767 - 4 Nov 2025
Cited by 1 | Viewed by 1242
Abstract
Although LoRa can be a valuable technology for remote areas lacking internet or cellular coverage, its difficulties with large data transmission have prevented it from being used effectively for image transmission. This challenge is particularly relevant for applications requiring the transfer of facial images, such as remote security or identification. These difficulties can be overcome by reducing the data size through various image processing methods. In this study, a face-focused enhanced super-resolution generative adversarial network (ESRGAN) is trained to address the significant quality loss in the low-resolution face images that reach the receiver after such processing. The trained ESRGAN model is evaluated comparatively against the Real-ESRGAN model and a standard bicubic interpolation baseline. In addition to Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index Measure (SSIM), Learned Perceptual Image Patch Similarity (LPIPS) for perceptual quality and a facial identity preservation metric are used to measure the similarity of the produced super-resolution (SR) images to the originals. Practical tests demonstrate that a facial image requiring 42 min to transmit via LoRa can be sent in 5 s using these image processing techniques, and that the images can be restored close to the originals at the receiver. Thus, with an integrated system that enhances the transmitted visual data, it becomes possible to transmit compressed, low-resolution image data over LoRa. The study aims to contribute to remote security and identification applications in regions with limited internet and cellular connectivity by significantly improving image transmission with LoRa. Full article
(This article belongs to the Section Computing and Artificial Intelligence)
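To make the data-reduction idea concrete, a minimal sender-side sketch: shrink and JPEG-compress the face image so it fits into a handful of LoRa payloads, leaving restoration to the trained ESRGAN at the receiver. The target size, JPEG quality, and payload limit below are assumptions, not the paper's settings:

```python
# Sender-side sketch: downscale + compress a face image, then split the
# bytes into LoRa-sized packets. All sizes here are illustrative.
import io
from PIL import Image

MAX_PAYLOAD = 222  # bytes; a typical LoRaWAN limit at fast data rates (assumed)

def prepare_for_lora(path: str, size=(64, 64), quality=60) -> list[bytes]:
    img = Image.open(path).convert("RGB").resize(size, Image.BICUBIC)
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=quality)
    data = buf.getvalue()
    # Split the compressed stream into transmission-sized chunks.
    return [data[i:i + MAX_PAYLOAD] for i in range(0, len(data), MAX_PAYLOAD)]

chunks = prepare_for_lora("face.jpg")  # "face.jpg" is a hypothetical input
print(f"{sum(len(c) for c in chunks)} bytes in {len(chunks)} packets")
```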

14 pages, 3652 KB  
Article
Enhancing Mobility for the Blind: An AI-Powered Bus Route Recognition System
by Shehzaib Shafique, Gian Luca Bailo, Monica Gori, Giulio Sciortino and Alessio Del Bue
Algorithms 2025, 18(10), 616; https://doi.org/10.3390/a18100616 - 30 Sep 2025
Viewed by 677
Abstract
Vision is a critical component of daily life, and its loss significantly hinders an individual’s ability to navigate, particularly when using public transportation systems. To address this challenge, this paper introduces a novel approach for accurately identifying bus route numbers and destinations, designed to assist visually impaired individuals in navigating urban transit networks. Our system integrates object detection, image enhancement, and Optical Character Recognition (OCR) technologies to achieve reliable and precise recognition of bus information. We employ a custom-trained You Only Look Once version 8 (YOLOv8) model to isolate the front portion of buses as the region of interest (ROI), effectively eliminating irrelevant text and advertisements that often lead to errors. To further enhance accuracy, we utilize the Enhanced Super-Resolution Generative Adversarial Network (ESRGAN) to improve image resolution, significantly boosting the confidence of the OCR process. Additionally, a post-processing step involving a pre-defined list of bus routes and the Levenshtein algorithm corrects potential errors in text recognition, ensuring reliable identification of bus numbers and destinations. Tested on a dataset of 120 images featuring diverse bus routes and challenging conditions such as poor lighting, reflections, and motion blur, our system achieved an accuracy rate of 95%. This performance surpasses existing methods and demonstrates the system’s potential for real-world application. By providing a robust and adaptable solution, our work aims to enhance public transit accessibility, empowering visually impaired individuals to navigate cities with greater independence and confidence. Full article
(This article belongs to the Section Combinatorial Optimization, Graph, and Network Algorithms)
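The post-processing step is straightforward to sketch: compute the Levenshtein distance from the OCR output to every entry in the pre-defined route list and keep the nearest one. The route list here is invented for illustration:

```python
# Snap a noisy OCR string to the nearest entry of a known route list
# using Levenshtein (edit) distance. Routes are hypothetical examples.
def levenshtein(a: str, b: str) -> int:
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

ROUTES = ["13 CARIGNANO", "18 STAZIONE", "42 OSPEDALE"]  # hypothetical list

def correct_ocr(text: str) -> str:
    return min(ROUTES, key=lambda r: levenshtein(text.upper(), r))

print(correct_ocr("I8 STAZI0NE"))  # -> "18 STAZIONE"
```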

18 pages, 43842 KB  
Article
DPO-ESRGAN: Perceptually Enhanced Super-Resolution Using Direct Preference Optimization
by Wonwoo Yun and Hanhoon Park
Electronics 2025, 14(17), 3357; https://doi.org/10.3390/electronics14173357 - 23 Aug 2025
Viewed by 1933
Abstract
Super-resolution (SR) is a long-standing task in computer vision that aims to improve the quality and resolution of an image. ESRGAN is a representative generative adversarial network specialized in producing perceptually convincing SR images. However, it often fails to recover local details and still produces blurry or unnatural visual artifacts, yielding SR images that people do not prefer. To address this problem, we propose adopting Direct Preference Optimization (DPO), originally devised to fine-tune large language models based on human preferences. To this end, we develop a method for applying DPO to ESRGAN and add a DPO loss for training the ESRGAN generator. Through ×4 SR experiments on benchmark datasets, we demonstrate that the proposed method produces SR images with significantly higher perceptual quality and human preference than ESRGAN and other ESRGAN variants that modify its loss or network structure. Specifically, compared to ESRGAN, the proposed method achieved, on average, 0.32 lower PieAPP values, 0.79 lower NIQE values, and 0.05 higher PSNR values on the BSD100 dataset, as well as 0.32 lower PieAPP values, 0.32 lower NIQE values, and 0.17 higher PSNR values on the Set14 dataset. Full article
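For reference, the generic DPO objective that the paper adapts takes the following form; how ESRGAN-generated SR images are mapped to the log-probability terms is the paper's contribution, so those inputs are left abstract here:

```python
# Generic DPO loss over a preferred (w) / dispreferred (l) output pair:
# L = -log sigmoid(beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))).
# The log-prob tensors are abstract placeholders in this sketch.
import torch.nn.functional as F

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta: float = 0.1):
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    return -F.logsigmoid(margin).mean()
```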

14 pages, 7081 KB  
Article
SupGAN: A General Super-Resolution GAN-Promoting Training Method
by Tao Wu, Shuo Xiong, Qiuhang Chen, Huaizheng Liu, Weijun Cao and Haoran Tuo
Appl. Sci. 2025, 15(17), 9231; https://doi.org/10.3390/app15179231 - 22 Aug 2025
Viewed by 915
Abstract
Image super-resolution (SR) methods based on Generative Adversarial Networks (GANs) have achieved impressive visual results. However, the weights of the loss functions in these methods are usually set manually to fixed values, which cannot fully adapt to different datasets and tasks and may degrade the perceptual quality of the SR images. To address this issue and further improve visual quality, we propose a perception-driven SupGAN, which improves the generator and loss function of GAN-based image super-resolution models. The generator adopts multi-scale feature extraction and fusion to restore SR images with diverse and fine textures. We design a network-training method based on the proportion of high-frequency information in images (BHFTM), which uses the proportion of high-frequency information obtained through the Canny operator to set the weights of the loss function. In addition, we employ the four-patch method to better simulate the degradation of complex real-world scenarios. We extensively test our method against recent SR methods (BSRGAN, Real-ESRGAN, RealSR, SwinIR, LDL, etc.) on different types of datasets (OST300, 2020track1, RealWorld38, BSDS100, etc.) with a scaling factor of ×4. The results show that the NIQE metric improves and that SupGAN generates more natural and fine textures while suppressing unpleasant artifacts. Full article
(This article belongs to the Special Issue Collaborative Learning and Optimization Theory and Its Applications)
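The BHFTM idea of weighting the loss by high-frequency content can be sketched with OpenCV's Canny operator; the mapping from edge ratio to loss weight below is an invented placeholder, not the paper's formula:

```python
# Estimate the proportion of high-frequency content via Canny edges and
# derive a loss weight from it. Thresholds and mapping are assumptions.
import cv2
import numpy as np

def high_freq_ratio(img_bgr: np.ndarray, t1: int = 100, t2: int = 200) -> float:
    gray = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, t1, t2)
    return float(np.count_nonzero(edges)) / edges.size

def perceptual_weight(img_bgr: np.ndarray, base: float = 1.0) -> float:
    # More high-frequency texture -> heavier perceptual/adversarial weight.
    return base * (1.0 + high_freq_ratio(img_bgr))
```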

16 pages, 2750 KB  
Article
Combining Object Detection, Super-Resolution GANs and Transformers to Facilitate Tick Identification Workflow from Crowdsourced Images on the eTick Platform
by Étienne Clabaut, Jérémie Bouffard and Jade Savage
Insects 2025, 16(8), 813; https://doi.org/10.3390/insects16080813 - 6 Aug 2025
Viewed by 1009
Abstract
Ongoing changes in the distribution and abundance of several tick species of medical relevance in Canada have prompted the development of the eTick platform—an image-based crowd-sourcing public surveillance tool for Canada enabling rapid tick species identification by trained personnel, and public health guidance based on tick species and province of residence of the submitter. Considering that more than 100,000 images from over 73,500 identified records representing 25 tick species have been submitted to eTick since the public launch in 2018, a partial automation of the image processing workflow could save substantial human resources, especially as submission numbers have been steadily increasing since 2021. In this study, we evaluate an end-to-end artificial intelligence (AI) pipeline to support tick identification from eTick user-submitted images, characterized by heterogeneous quality and uncontrolled acquisition conditions. Our framework integrates (i) tick localization using a fine-tuned YOLOv7 object detection model, (ii) resolution enhancement of cropped images via super-resolution Generative Adversarial Networks (RealESRGAN and SwinIR), and (iii) image classification using deep convolutional (ResNet-50) and transformer-based (ViT) architectures across three datasets (12, 6, and 3 classes) of decreasing granularities in terms of taxonomic resolution, tick life stage, and specimen viewing angle. ViT consistently outperformed ResNet-50, especially in complex classification settings. The configuration yielding the best performance—relying on object detection without incorporating super-resolution—achieved a macro-averaged F1-score exceeding 86% in the 3-class model (Dermacentor sp., other species, bad images), with minimal critical misclassifications (0.7% of “other species” misclassified as Dermacentor). Given that Dermacentor ticks represent more than 60% of tick volume submitted on the eTick platform, the integration of a low granularity model in the processing workflow could save significant time while maintaining very high standards of identification accuracy. Our findings highlight the potential of combining modern AI methods to facilitate efficient and accurate tick image processing in community science platforms, while emphasizing the need to adapt model complexity and class resolution to task-specific constraints. Full article
(This article belongs to the Section Medical and Livestock Entomology)
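A skeleton of the three-stage pipeline (detect and crop the tick, optionally super-resolve the crop, classify) might look as follows; Ultralytics YOLO and a torchvision ResNet-50 stand in for the authors' fine-tuned YOLOv7 and ViT models, and the weight file is hypothetical:

```python
# Detect -> crop -> (optional SR) -> classify, with stand-in models.
import torch
from PIL import Image
from torchvision import models, transforms
from ultralytics import YOLO

detector = YOLO("tick_detector.pt")  # hypothetical fine-tuned weights
classifier = models.resnet50(weights="IMAGENET1K_V2").eval()
prep = transforms.Compose([
    transforms.Resize((224, 224)), transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

def classify_submission(path: str) -> int:
    img = Image.open(path).convert("RGB")
    box = detector(img)[0].boxes.xyxy[0].tolist()  # highest-confidence tick
    crop = img.crop(tuple(box))
    # (optional) super-resolution of `crop` with Real-ESRGAN/SwinIR goes here
    with torch.no_grad():
        logits = classifier(prep(crop).unsqueeze(0))
    return int(logits.argmax())
```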

31 pages, 8947 KB  
Article
Research on Super-Resolution Reconstruction of Coarse Aggregate Particle Images for Earth–Rock Dam Construction Based on Real-ESRGAN
by Shuangping Li, Lin Gao, Bin Zhang, Zuqiang Liu, Xin Zhang, Linjie Guan and Junxing Zheng
Sensors 2025, 25(13), 4084; https://doi.org/10.3390/s25134084 - 30 Jun 2025
Viewed by 861
Abstract
This paper investigates the super-resolution reconstruction technology of coarse granular particle images for embankment construction in earth/rock dams based on Real-ESRGAN, aiming to improve the quality of low-resolution particle images and enhance the accuracy of particle shape analysis. The paper begins with a review of traditional image super-resolution methods, introducing Generative Adversarial Networks (GAN) and Real-ESRGAN, which effectively enhance image detail recovery through perceptual loss and adversarial training. To improve the generalization ability of the super-resolution model, the study expands the morphological database of earth/rock dam particles by employing a multi-modal data augmentation strategy, covering a variety of particle shapes. The paper utilizes a dual-stage degradation model to simulate the image degradation process in real-world environments, providing a diverse set of degraded images for training the super-resolution reconstruction model. Through wavelet transform methods, the paper analyzes the edge and texture features of particle images, further improving the precision of particle shape feature extraction. Experimental results show that Real-ESRGAN outperforms other traditional super-resolution algorithms in terms of edge clarity, detail recovery, and the preservation of morphological features of particle images, particularly under low-resolution conditions, with significant improvement in image reconstruction. In conclusion, Real-ESRGAN demonstrates excellent performance in the super-resolution reconstruction of coarse granular particle images for embankment construction in earth/rock dams. It can effectively restore the details and morphological features of particle images, providing more accurate technical support for particle shape analysis in civil engineering. Full article
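The dual-stage degradation model can be sketched as two passes of blur, downscaling, noise, and JPEG compression; the parameter ranges below are illustrative, not those used in the paper:

```python
# Two-pass synthetic degradation (blur -> downscale -> noise -> JPEG)
# for generating realistic low-resolution training inputs. Ranges assumed.
import cv2
import numpy as np

def degrade_once(img: np.ndarray, scale: float) -> np.ndarray:
    img = cv2.GaussianBlur(img, (0, 0), sigmaX=np.random.uniform(0.5, 2.0))
    h, w = img.shape[:2]
    img = cv2.resize(img, (int(w / scale), int(h / scale)),
                     interpolation=cv2.INTER_LINEAR)
    img = np.clip(img + np.random.normal(0, 5, img.shape), 0, 255).astype(np.uint8)
    quality = int(np.random.randint(60, 95))
    _, enc = cv2.imencode(".jpg", img, [cv2.IMWRITE_JPEG_QUALITY, quality])
    return cv2.imdecode(enc, cv2.IMREAD_COLOR)

def dual_stage_degradation(hr: np.ndarray, total_scale: int = 4) -> np.ndarray:
    lr = degrade_once(hr, scale=total_scale / 2)  # first degradation stage
    return degrade_once(lr, scale=2)              # second degradation stage
```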

14 pages, 8597 KB  
Article
AI-Based Enhancing of xBn MWIR Thermal Camera Performance at 180 Kelvin
by Michael Zadok, Zeev Zalevsky and Benjamin Milgrom
Sensors 2025, 25(10), 3200; https://doi.org/10.3390/s25103200 - 19 May 2025
Viewed by 975
Abstract
Thermal imaging technology has revolutionized various fields, but current high operating temperature (HOT) mid-wave infrared (MWIR) cameras, particularly those based on xBn detectors, face limitations in size and cost due to the need for cooling to 150 Kelvin. This study explores the potential of extending the operating temperature of these cameras to 180 Kelvin, leveraging advanced AI algorithms to mitigate the increased thermal noise expected at higher temperatures. This research investigates the feasibility and effectiveness of this approach for remote sensing applications, combining experimental data with cutting-edge image enhancement techniques like Enhanced Super-Resolution Generative Adversarial Networks (ESRGAN). The findings demonstrate the potential of 180 Kelvin operation for xBn MWIR cameras, particularly in daylight conditions, paving the way for a new generation of more affordable and compact thermal imaging systems. Full article
(This article belongs to the Section Sensing and Imaging)

26 pages, 20953 KB  
Article
Optimization-Based Downscaling of Satellite-Derived Isotropic Broadband Albedo to High Resolution
by Niko Lukač, Domen Mongus and Marko Bizjak
Remote Sens. 2025, 17(8), 1366; https://doi.org/10.3390/rs17081366 - 11 Apr 2025
Cited by 1 | Viewed by 806
Abstract
In this paper, a novel method for estimating high-resolution isotropic broadband albedo is proposed that downscales satellite-derived albedo using an optimization approach. First, broadband albedo is calculated from the lower-resolution multispectral satellite image using standard narrow-to-broadband (NTB) conversion, where the surfaces are considered Lambertian with isotropic reflectance. The high-resolution true orthophoto for the same location is segmented with the deep learning-based Segment Anything Model (SAM), and the resulting segments are refined with a classified digital surface model (cDSM) to exclude small transient objects. Afterwards, the remaining segments are grouped using K-means clustering based on the orthophoto-visible (VIS) and near-infrared (NIR) bands; these segments represent surfaces with similar materials and underlying reflectance properties. Next, the Differential Evolution (DE) optimization algorithm is applied to assign albedo values to these segments so that their spatial aggregate matches the coarse-resolution satellite albedo, using two novel objective functions. Extensive experiments considering different DE parameters were carried out over a 0.75 km² urban area in Maribor, Slovenia, where Sentinel-2 Level-2A NTB-derived albedo was downscaled to 1 m spatial resolution. In the spatiospectral analysis, the proposed method achieved absolute differences of 0.09 per VIS band and below 0.18 per NIR band in comparison to the lower-resolution NTB-derived albedo. Moreover, the proposed method achieved a root mean square error (RMSE) of 0.0179 and a mean absolute percentage error (MAPE) of 4.0299% against ground truth broadband albedo annotations of characteristic materials in the given urban area. The proposed method outperformed the Enhanced Super-Resolution Generative Adversarial Networks (ESRGANs), which achieved an RMSE of 0.0285 and an MAPE of 9.2778%, and the Blind Super-Resolution Generative Adversarial Network (BSRGAN), which achieved an RMSE of 0.0341 and an MAPE of 12.3104%. Full article
(This article belongs to the Section AI Remote Sensing)
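A toy version of the optimization step, for one coarse pixel covered by three segments: Differential Evolution searches for per-segment albedos whose area-weighted aggregate matches the coarse NTB-derived value. The area fractions, target value, and single-pixel setup are simplifications for illustration:

```python
# Assign per-segment albedos so their area-weighted mean matches the
# coarse satellite albedo. Values below are invented for illustration.
import numpy as np
from scipy.optimize import differential_evolution

areas = np.array([0.5, 0.3, 0.2])  # segment area fractions in one coarse pixel
coarse_albedo = 0.21               # NTB-derived albedo of that pixel (assumed)

def objective(albedos: np.ndarray) -> float:
    # Squared mismatch between the aggregate and the coarse observation.
    return (np.dot(areas, albedos) - coarse_albedo) ** 2

result = differential_evolution(objective, bounds=[(0.0, 1.0)] * 3, seed=0)
print(result.x, result.fun)
```

In the single-pixel form the problem is underdetermined; the paper's two objective functions and the many-pixel coverage are what constrain the real solution.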

19 pages, 2806 KB  
Article
SP-IGAN: An Improved GAN Framework for Effective Utilization of Semantic Priors in Real-World Image Super-Resolution
by Meng Wang, Zhengnan Li, Haipeng Liu, Zhaoyu Chen and Kewei Cai
Entropy 2025, 27(4), 414; https://doi.org/10.3390/e27040414 - 11 Apr 2025
Cited by 2 | Viewed by 1167
Abstract
Single-image super-resolution (SISR) based on GANs has achieved significant progress. However, these methods still face challenges when reconstructing locally consistent textures due to a lack of semantic understanding of image categories. This highlights the necessity of focusing on contextual information comprehension and the acquisition of high-frequency details in model design. To address this issue, we propose the Semantic Prior-Improved GAN (SP-IGAN) framework, which incorporates additional contextual semantic information into the Real-ESRGAN model. The framework consists of two branches. The main branch introduces a Graph Convolutional Channel Attention (GCCA) module to transform channel dependencies into adjacency relationships between feature vertices, thereby enhancing pixel associations. The auxiliary branch strengthens the correlation between semantic category information and regional textures in the Residual-in-Residual Dense Block (RRDB) module. The auxiliary branch employs a pretrained segmentation model to accurately extract regional semantic information from the input low-resolution image. This information is injected into the RRDB module through Spatial Feature Transform (SFT) layers, generating more accurate and semantically consistent texture details. Additionally, a wavelet loss is incorporated into the loss function to capture high-frequency details that are often overlooked. The experimental results demonstrate that the proposed SP-IGAN outperforms state-of-the-art (SOTA) super-resolution models across multiple public datasets. For the X4 super-resolution task, SP-IGAN achieves a 0.55 dB improvement in Peak Signal-to-Noise Ratio (PSNR) and a 0.0363 increase in Structural Similarity Index (SSIM) compared to the baseline model Real-ESRGAN. Full article
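The wavelet loss idea can be sketched with PyWavelets by comparing the high-frequency detail sub-bands of the reconstruction and the ground truth; a differentiable PyTorch variant would be needed for actual training, and the choice of the Haar wavelet is an assumption:

```python
# NumPy sketch of a wavelet loss: L1 distance over the detail sub-bands
# (LH, HL, HH) of a single-level 2-D DWT. Wavelet choice is assumed.
import numpy as np
import pywt

def wavelet_loss(sr: np.ndarray, hr: np.ndarray, wavelet: str = "haar") -> float:
    """sr, hr: 2-D grayscale arrays of equal shape."""
    _, (lh_s, hl_s, hh_s) = pywt.dwt2(sr, wavelet)
    _, (lh_h, hl_h, hh_h) = pywt.dwt2(hr, wavelet)
    return float(sum(np.abs(a - b).mean()
                     for a, b in [(lh_s, lh_h), (hl_s, hl_h), (hh_s, hh_h)]))
```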

14 pages, 10981 KB  
Article
Enhancement of Sentinel-2A Images for Ship Detection via Real-ESRGAN Model
by Cemre Fazilet Aldoğan, Koray Aksu and Hande Demirel
Appl. Sci. 2024, 14(24), 11988; https://doi.org/10.3390/app142411988 - 21 Dec 2024
Cited by 6 | Viewed by 3295
Abstract
Ship detection holds great value for port management, logistics operations, ship security, and other crucial surveillance and safety issues. Recently, ship detection from optical satellite imagery has gained popularity among the research community because optical images are easily accessible at little or no cost. However, the quality and level of feature detail of these images are bound to their spatial resolution, which is often medium to low, while accurately detecting ships requires images with richer texture and resolution. Super-resolution is used to recover features in medium- to low-resolution images, which can help improve ship detection accuracy. In this regard, this paper quantitatively and visually investigates the effectiveness of super-resolution in enabling more accurate ship detection in medium-spatial-resolution images by comparing Sentinel-2A images with enhanced Sentinel-2A images. A collection of Sentinel-2A images was upscaled by a factor of four with a Real-ESRGAN model trained on high-spatial-resolution PlanetScope images. Separate ship detections with YOLOv10 were implemented for the Sentinel-2A images and the enhanced Sentinel-2A images, and the visual and metric results were compared to demonstrate the contribution of enhancement to detection accuracy. Ship detection on the enhanced Sentinel-2A images achieves mAP50 and mAP50-95 values of 87.5% and 68.5%, respectively, outperforming detection on the original Sentinel-2A images by 2.6% in both mAP50 and mAP50-95 and demonstrating the positive contribution of super-resolution. Full article

26 pages, 4835 KB  
Article
Optimization of Imaging Reconnaissance Systems Using Super-Resolution: Efficiency Analysis in Interference Conditions
by Marta Bistroń and Zbigniew Piotrowski
Sensors 2024, 24(24), 7977; https://doi.org/10.3390/s24247977 - 13 Dec 2024
Cited by 4 | Viewed by 1795
Abstract
Image reconnaissance systems are critical in modern applications, where the ability to accurately detect and identify objects is crucial. However, distortions in real-world operational conditions, such as motion blur, noise, and compression artifacts, often degrade image quality, affecting the performance of detection systems. This study analyzed the impact of super-resolution (SR) technology, in particular, the Real-ESRGAN model, on the performance of a detection model under disturbed conditions. The methodology involved training and evaluating the Faster R-CNN detection model with original and modified data sets. The results showed that SR significantly improved detection precision and mAP in most interference scenarios. These findings underscore SR’s potential to improve imaging systems while identifying key areas for future development and further research. Full article
(This article belongs to the Special Issue Sensors and Machine-Learning Based Signal Processing)

13 pages, 4124 KB  
Article
Intelligent Detection Method for Surface Defects of Particleboard Based on Super-Resolution Reconstruction
by Haiyan Zhou, Haifei Xia, Chenlong Fan, Tianxiang Lan, Ying Liu, Yutu Yang, Yinxi Shen and Wei Yu
Forests 2024, 15(12), 2196; https://doi.org/10.3390/f15122196 - 13 Dec 2024
Cited by 8 | Viewed by 1731
Abstract
To improve the intelligence level of particleboard inspection lines, machine vision and artificial intelligence technologies are combined to replace manual inspection with automatic detection. To address missed and false detections of small defects caused by particleboard's large surface width, complex texture, and varied surface defect shapes, this paper introduces image super-resolution technology and proposes a super-resolution reconstruction model for particleboard images. Based on the Transformer network, this model incorporates an improved SRResNet (Super-Resolution Residual Network) backbone network in the deep feature extraction module to extract deep texture information. The shallow features extracted by conv 3 × 3 are then fused with features extracted by the Transformer, capturing both local texture features and global feature information. This enhances image quality and makes defect details clearer. Through comparison with the traditional bicubic B-spline interpolation method, ESRGAN (Enhanced Super-Resolution Generative Adversarial Network), and SwinIR (Image Restoration Using Swin Transformer), the effectiveness of the particleboard super-resolution reconstruction model is verified using objective evaluation metrics including PSNR, SSIM, and LPIPS, demonstrating its ability to produce higher-quality images with more detail and better visual characteristics. Finally, using the YOLOv8 model to compare defect detection rates between super-resolution and low-resolution images, the mAP on super-resolution images reaches 96.5%, which is 25.6% higher than the low-resolution image recognition rate. Full article
(This article belongs to the Section Wood Science and Forest Products)
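The shallow/deep fusion described above can be sketched as follows; the channel width and the stand-in for the Transformer branch are placeholders, not the paper's architecture:

```python
# Fuse shallow conv-3x3 features with a deeper branch before reconstruction.
import torch
import torch.nn as nn

class ShallowDeepFusion(nn.Module):
    def __init__(self, channels: int = 64):
        super().__init__()
        self.shallow = nn.Conv2d(3, channels, kernel_size=3, padding=1)
        self.deep = nn.Sequential(  # stand-in for the Transformer/SRResNet branch
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        self.fuse = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        s = self.shallow(x)      # local texture features from conv 3x3
        d = self.deep(s)         # deep/global features (placeholder branch)
        return self.fuse(s + d)  # fuse shallow and deep information
```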

18 pages, 2655 KB  
Article
Advanced Image Preprocessing and Integrated Modeling for UAV Plant Image Classification
by Girma Tariku, Isabella Ghiglieno, Anna Simonetto, Fulvio Gentilin, Stefano Armiraglio, Gianni Gilioli and Ivan Serina
Drones 2024, 8(11), 645; https://doi.org/10.3390/drones8110645 - 6 Nov 2024
Cited by 6 | Viewed by 3140
Abstract
The automatic identification of plant species using unmanned aerial vehicles (UAVs) is a valuable tool for ecological research. However, challenges such as reduced spatial resolution due to high-altitude operations, image degradation from camera optics and sensor limitations, and information loss caused by terrain shadows hinder the accurate classification of plant species from UAV imagery. This study addresses these issues by proposing a novel image preprocessing pipeline and evaluating its impact on model performance. Our approach improves image quality through a multi-step pipeline that includes Enhanced Super-Resolution Generative Adversarial Networks (ESRGAN) for resolution enhancement, Contrast-Limited Adaptive Histogram Equalization (CLAHE) for contrast improvement, and white balance adjustments for accurate color representation. These preprocessing steps ensure high-quality input data, leading to better model performance. For feature extraction and classification, we employ a pre-trained VGG-16 deep convolutional neural network, followed by machine learning classifiers, including Support Vector Machine (SVM), random forest (RF), and Extreme Gradient Boosting (XGBoost). This hybrid approach, combining deep learning for feature extraction with machine learning for classification, not only enhances classification accuracy but also reduces computational resource requirements compared to relying solely on deep learning models. Notably, the VGG-16 + SVM model achieved an outstanding accuracy of 97.88% on a dataset preprocessed with ESRGAN and white balance adjustments, with a precision of 97.9%, a recall of 97.8%, and an F1 score of 0.978. Through a comprehensive comparative study, we demonstrate that the proposed framework, utilizing VGG-16 for feature extraction, SVM for classification, and preprocessed images with ESRGAN and white balance adjustments, achieves superior performance in plant species identification from UAV imagery. Full article
(This article belongs to the Section Drones in Ecology)
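The contrast and colour steps of the preprocessing pipeline can be sketched with OpenCV: CLAHE on the lightness channel followed by a simple gray-world white balance. The clip limit, tile grid, and gray-world formulation are assumptions, and the ESRGAN upscaling step would run before these:

```python
# CLAHE on the L channel of LAB, then gray-world white balance.
# Parameters are illustrative, not the study's settings.
import cv2
import numpy as np

def clahe_lab(img_bgr: np.ndarray, clip: float = 2.0, grid=(8, 8)) -> np.ndarray:
    lab = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2LAB)
    l, a, b = cv2.split(lab)
    l = cv2.createCLAHE(clipLimit=clip, tileGridSize=grid).apply(l)
    return cv2.cvtColor(cv2.merge((l, a, b)), cv2.COLOR_LAB2BGR)

def gray_world(img_bgr: np.ndarray) -> np.ndarray:
    means = img_bgr.reshape(-1, 3).mean(axis=0)           # per-channel means
    gain = means.mean() / np.maximum(means, 1e-6)          # equalize channels
    return np.clip(img_bgr * gain, 0, 255).astype(np.uint8)

def preprocess(img_bgr: np.ndarray) -> np.ndarray:
    # (1) ESRGAN upscaling would run first; (2) CLAHE; (3) white balance.
    return gray_world(clahe_lab(img_bgr))
```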
