Search Results (68)

Search Parameters:
Keywords = face image super-resolution

18 pages, 1452 KB  
Article
An Efficient Pedestrian Gender Recognition Method Based on Key Area Feature Extraction and Information Fusion
by Ye Zhang, Weidong Yan, Guoqi Liu, Ning Jin and Lu Han
Appl. Sci. 2026, 16(3), 1298; https://doi.org/10.3390/app16031298 - 27 Jan 2026
Abstract
Aiming to address the problems of scale uncertainty, feature extraction difficulty, model training difficulty, poor real-time performance, and sample imbalance in low-resolution images for gender recognition, this study proposes an efficient pedestrian gender recognition model based on key area feature extraction and fusion. First, a discrete cosine transform (DCT)-based local super-resolution preprocessing algorithm is developed for facial image gender recognition. Then, a key area feature extraction and information fusion model is designed, using additional appearance features to assist gender recognition and improve accuracy. The proposed model preprocesses images using DCT-based image fusion and super-resolution methods, dividing pedestrian images into three regions: face, hair, and lower body (legs). Features are extracted separately from each of the three regions. Finally, a multi-region local gender recognition classifier is designed and trained, employing decision-level information fusion: the results of the three local classifiers are fused using a Bayesian computation-based strategy to obtain the final recognition of a pedestrian’s gender. This study uses surveillance video data to create a dataset for experimental comparison. Experimental results demonstrate the superiority of the proposed approach. The facial model (DCT-PFSR-CNN) achieved the best accuracy of 89% and an F1-score of 0.88. Furthermore, the complete pedestrian model (MPGRM) attained an mAP of 0.85 and an AUC of 0.86, surpassing the strongest baseline (HDFL) by 2.4% in mAP and 2.3% in AUC. These results confirm the high application potential of the proposed method for gender recognition in real-world surveillance scenarios. Full article
(This article belongs to the Section Computing and Artificial Intelligence)
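The decision-level Bayesian fusion described in this abstract can be sketched as follows. This is a minimal naive-Bayes combination of per-region posteriors under a conditional-independence assumption; the function name `bayes_fuse` and the flat prior are illustrative, not the paper's actual formulation:

```python
import numpy as np

def bayes_fuse(region_probs, prior=0.5):
    """Fuse per-region P(male) estimates from local classifiers assuming
    conditional independence: P(c | r1..rk) ∝ P(c) * Π P(c | ri) / P(c)."""
    p = np.asarray(region_probs, dtype=float)
    odds_m = prior * np.prod(p / prior)                    # unnormalised P(male)
    odds_f = (1.0 - prior) * np.prod((1.0 - p) / (1.0 - prior))
    return float(odds_m / (odds_m + odds_f))               # normalise
```

With three agreeing-but-uncertain regions (e.g. 0.9, 0.6, 0.7), the fused posterior is more confident than any single region, which is the intended benefit of combining face, hair, and lower-body classifiers.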
21 pages, 4544 KB  
Article
Small Ship Detection Based on a Learning Model That Incorporates Spatial Attention Mechanism as a Loss Function in SU-ESRGAN
by Kohei Arai, Yu Morita and Hiroshi Okumura
Remote Sens. 2026, 18(3), 417; https://doi.org/10.3390/rs18030417 - 27 Jan 2026
Abstract
Ship monitoring using Synthetic Aperture Radar (SAR) data faces significant challenges in detecting small vessels due to low spatial resolution and speckle noise. While ESRGAN (Enhanced Super-Resolution Generative Adversarial Network) has shown promise for image super-resolution, it struggles with SAR imagery characteristics. This study proposes SA/SU-ESRGAN, which extends the SU-ESRGAN framework by incorporating a spatial attention mechanism loss function. SU-ESRGAN introduced semantic structural loss to accurately preserve ship shapes and contours; our enhancement adds spatial attention to focus reconstruction efforts on ship regions while suppressing background noise. Experimental results demonstrate that SA/SU-ESRGAN successfully detects small vessels that remain undetectable by SU-ESRGAN, achieving improved detection capabilities with a PSNR of approximately 26 dB (SSIM is around 0.5) and enhanced visual clarity in ship boundaries. The spatial attention mechanism effectively reduces noise influence, producing clearer super-resolution results suitable for maritime surveillance applications. Based on the HRSID dataset, a representative dataset for evaluating ship detection performance using SAR data, we evaluated ship detection performance using images in which the spatial resolution of the SAR data was artificially degraded using a smoothing filter. We found that with a 4 × 4 filter, all eight ships were detected without any problems, but with an 8 × 8 filter, only three of the eight ships were detected. When super-resolution was applied to this, six ships were detected. Full article
(This article belongs to the Special Issue Applications of SAR for Environment Observation Analysis)
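The abstract's spatial-attention loss can be illustrated with a minimal sketch: an L1 reconstruction loss whose per-pixel weights are boosted inside attended (ship) regions and left at a baseline elsewhere. The weighting scheme and the `boost` hyper-parameter are assumptions for illustration, not SA/SU-ESRGAN's actual loss:

```python
import numpy as np

def spatial_attention_l1(sr, hr, attn, boost=4.0):
    """Weighted L1 loss: `attn` in [0, 1] marks ship regions; attended
    pixels get weight up to (1 + boost), background stays at 1."""
    w = 1.0 + boost * attn
    return float(np.sum(w * np.abs(sr - hr)) / np.sum(w))
```

Errors inside ship regions then dominate the loss, steering reconstruction effort toward vessels while de-emphasising speckled background.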

18 pages, 10421 KB  
Article
A Deep Learning Framework with Multi-Scale Texture Enhancement and Heatmap Fusion for Face Super Resolution
by Bing Xu, Lei Wang, Yanxia Wu, Xiaoming Liu and Lu Gan
AI 2026, 7(1), 20; https://doi.org/10.3390/ai7010020 - 9 Jan 2026
Abstract
Face super-resolution (FSR) has made great progress thanks to deep learning and facial priors. However, many existing methods do not fully exploit landmark heatmaps and lack effective multi-scale texture modeling, which often leads to texture loss and artifacts under large upscaling factors. To address these problems, we propose a Multi-Scale Residual Stacking Network (MRSNet), which integrates multi-scale texture enhancement with multi-stage heatmap fusion. The MRSNet is built upon Residual Attention-Guided Units (RAGUs) and incorporates a Face Detail Enhancer (FDE), which applies edge, texture, and region branches to achieve differentiated enhancement across facial components. Furthermore, we design a Multi-Scale Texture Enhancement Module (MTEM) that employs progressive average pooling to construct hierarchical receptive fields and employs heatmap-guided attention for adaptive texture refinement. In addition, we introduce a multi-stage heatmap fusion strategy that injects landmark priors into multiple phases of the network, including feature extraction, texture enhancement, and detail reconstruction, enabling deep sharing and progressive integration of prior knowledge. Extensive experiments on CelebA and Helen demonstrate that the proposed method achieves superior detail recovery and generates perceptually realistic high-resolution face images. Both quantitative and qualitative evaluations confirm that our approach outperforms state-of-the-art methods. Full article

40 pages, 12777 KB  
Systematic Review
A Systematic Review of Diffusion Models for Medical Image-Based Diagnosis: Methods, Taxonomies, Clinical Integration, Explainability, and Future Directions
by Mohammad Azad, Nur Mohammad Fahad, Mohaimenul Azam Khan Raiaan, Tanvir Rahman Anik, Md Faraz Kabir Khan, Habib Mahamadou Kélé Toyé and Ghulam Muhammad
Diagnostics 2026, 16(2), 211; https://doi.org/10.3390/diagnostics16020211 - 9 Jan 2026
Abstract
Background and Objectives: Diffusion models, as a recent advancement in generative modeling, have become central to high-resolution image synthesis and reconstruction. Their rapid progress has notably shaped computer vision and health informatics, particularly by enhancing medical imaging and diagnostic workflows. However, despite these developments, researchers continue to face challenges due to the absence of a structured and comprehensive discussion on the use of diffusion models within clinical imaging. Methods: This systematic review investigates the application of diffusion models in medical imaging for diagnostic purposes. It provides an integrated overview of their underlying principles, major application areas, and existing research limitations. The review followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) 2020 guidelines and included peer-reviewed studies published between 2013 and 2024. Studies were eligible if they employed diffusion models for diagnostic tasks in medical imaging; non-medical studies and those not involving diffusion-based methods were excluded. Searches were conducted across major scientific databases prior to the review. Risk of bias was assessed based on methodological rigor and reporting quality. Given the heterogeneity of study designs, a narrative synthesis approach was used. Results: A total of 68 studies met the inclusion criteria, spanning multiple imaging modalities and falling into eight major application categories: anomaly detection, classification, denoising, generation, reconstruction, segmentation, super-resolution, and image-to-image translation. Explainable AI components were present in 22.06% of the studies, clinician engagement in 57.35%, and real-time implementation in 10.30%. Overall, the findings highlight the strong diagnostic potential of diffusion models but also emphasize the variability in reporting standards, methodological inconsistencies, and the limited validation in real-world clinical settings. Conclusions: Diffusion models offer significant promise for diagnostic imaging, yet their reliable clinical deployment requires advances in explainability, clinician integration, and real-time performance. This review identifies twelve key research directions that can guide future developments and support the translation of diffusion-based approaches into routine medical practice. Full article
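The generative mechanism shared by the reviewed models is the forward noising process; a minimal sketch (standard DDPM-style linear schedule, illustrative constants, not any particular reviewed method):

```python
import numpy as np

betas = np.linspace(1e-4, 0.02, 1000)       # linear noise schedule
alpha_bar = np.cumprod(1.0 - betas)          # cumulative product alpha_bar_t

def q_sample(x0, t, rng):
    """Sample x_t ~ q(x_t | x_0) = N(sqrt(alpha_bar_t) * x0, (1 - alpha_bar_t) I).
    A network is then trained to invert this corruption step by step."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
```

At t = 0 the sample is nearly the clean image; by the final step alpha_bar is close to zero and the sample is essentially pure noise, which is why the learned reverse process can start generation from noise.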

21 pages, 5246 KB  
Article
Improving Face Image Transmission with LoRa Using a Generative Adversarial Network
by Bilal Babayiğit and Fatma Yarlı Doğan
Appl. Sci. 2025, 15(21), 11767; https://doi.org/10.3390/app152111767 - 4 Nov 2025
Cited by 1
Abstract
Although LoRa can be highly valuable for remote areas that lack internet or cellular coverage, its limited capacity for large data transfers has kept it from maturing as a medium for image transmission. This challenge is particularly relevant for applications requiring the transfer of facial images, such as remote security or identification. These difficulties can be overcome by reducing the data size through various image processing methods. In this study, a face-focused enhanced super-resolution generative adversarial network (ESRGAN) is trained to recover the significant quality loss in the low-resolution face images that reach the receiver after such preprocessing. The trained ESRGAN model is also evaluated comparatively against the Real-ESRGAN model and a standard bicubic interpolation baseline. In addition to Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index Measure (SSIM), Learned Perceptual Image Patch Similarity (LPIPS) for perceptual quality and a facial identity preservation metric are used to measure how closely the produced super-resolution (SR) images match the originals. Tested in practice, the approach shows that a facial image that takes 42 min to transmit via LoRa can be transmitted in 5 s using these image processing techniques, and that the received images can be restored to closely approximate the originals. Thus, with an integrated system that enhances the transmitted visual data, it becomes possible to transmit compressed, low-resolution image data using LoRa. The study aims to contribute to remote security and identification work in regions with poor internet and cellular connectivity by making significant improvements in image transmission with LoRa. Full article
(This article belongs to the Section Computing and Artificial Intelligence)
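The 42 min vs. 5 s transmission gap comes down to packet-count arithmetic: shrinking the image shrinks the number of LoRa packets. A rough sketch, where the payload size and per-packet airtime are illustrative assumptions rather than the paper's radio settings:

```python
import math

def lora_tx_time_s(n_bytes, payload_bytes=200, packet_airtime_s=0.4):
    """Rough LoRa transfer time: number of packets times per-packet airtime.
    Ignores duty-cycle limits, headers, and retransmissions."""
    return math.ceil(n_bytes / payload_bytes) * packet_airtime_s
```

For example, downscaling a 256×256 grayscale image to 64×64 cuts the byte count (and thus the estimated airtime) by roughly 16×, which is the lever that super-resolution at the receiver then compensates for.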

17 pages, 3049 KB  
Article
PECNet: A Lightweight Single-Image Super-Resolution Network with Periodic Boundary Padding Shift and Multi-Scale Adaptive Feature Aggregation
by Tianyu Gao and Yuhao Liu
Symmetry 2025, 17(11), 1833; https://doi.org/10.3390/sym17111833 - 1 Nov 2025
Abstract
Lightweight Single-Image Super-Resolution (SISR) faces the core challenge of balancing computational efficiency with reconstruction quality, particularly in preserving both high-frequency details and global structures under constrained resources. To address this, we propose the Periodically Enhanced Cascade Network (PECNet). Our main contributions are as follows: 1. PECNet's core component is a novel Multi-scale Adaptive Feature Aggregation (MAFA) module, which employs three functionally complementary branches that work synergistically: one dedicated to extracting local high-frequency details, another to efficiently modeling long-range dependencies, and a third to capturing structured contextual information within windows. 2. To seamlessly integrate these branches and enable cross-window information interaction, we introduce the Periodic Boundary Padding Shift (PBPS) mechanism. This mechanism serves as a symmetric preprocessing step that achieves implicit window shifting without introducing any additional computational overhead. Extensive benchmarking shows that PECNet achieves better reconstruction quality without increased complexity. Taking the representative shifted-window lightweight model NGswin as an example, for ×4 SR on the Manga109 dataset, PECNet achieves an average PSNR 0.25 dB higher, while its computational cost (in FLOPs) is merely 40% of NGswin's. Full article
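A periodic (wrap-around) shift before window partitioning, which is the flavor of cross-window interaction the PBPS name suggests, can be sketched in one line; this is a generic shifted-window trick, with the shift amount assumed, not the paper's exact mechanism:

```python
import numpy as np

def pbps(x, shift):
    """Periodic shift of an H×W feature map: pixels wrapping around the
    boundary let fixed windows on the shifted map straddle the original
    window borders. The inverse is a roll by -shift."""
    return np.roll(x, (shift, shift), axis=(0, 1))
```

Because `np.roll` only permutes memory, the shift itself adds no arithmetic cost, consistent with the abstract's "no additional computational overhead" claim.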

16 pages, 6539 KB  
Article
A High-Precision Ionospheric Channel Estimation Method Based on Oblique Projection and Double-Space Decomposition
by Zhengkai Wei, Baiyang Guo, Zhihui Li and Qingsong Zhou
Sensors 2025, 25(18), 5727; https://doi.org/10.3390/s25185727 - 14 Sep 2025
Abstract
Accurate ionospheric channel estimation is of great significance for acquisition of ionospheric structure, error correction of remote sensing data, high-precision Synthetic Aperture Radar (SAR) imaging, over-the-horizon (OTH) detection, and the establishment of stable communication links. Traditional super-resolution channel estimation algorithms face challenges in terms of multipath correlation and noise interference when estimating ionospheric channel information. Meanwhile, some super-resolution algorithms struggle to meet the requirements of real-time measurement due to their high computational complexity. In this paper, we propose the Cross-correlation Oblique Projection Pursuit (CC-OPMP) algorithm, which constructs an atom selection strategy for anti-interference correlation metric and a dual-space multipath separation mechanism based on a greedy framework to effectively suppress noise and separate neighboring multipath components. Simulations demonstrate that the CC-OPMP algorithm outperforms other algorithms in both channel estimation accuracy and computational efficiency. Full article
(This article belongs to the Section Intelligent Sensors)
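The greedy framework CC-OPMP builds on can be illustrated with a plain orthogonal-matching-pursuit baseline: pick the atom most correlated with the residual, re-fit by least squares, repeat. The paper's contributions (oblique projection, anti-interference correlation metric, dual-space separation) are not reproduced here; this is only the underlying greedy loop:

```python
import numpy as np

def omp(A, y, k):
    """Orthogonal matching pursuit: recover a k-sparse x with y ≈ A x."""
    residual, idx = y.copy(), []
    for _ in range(k):
        corr = np.abs(A.T @ residual)
        corr[idx] = 0                                  # skip chosen atoms
        idx.append(int(np.argmax(corr)))               # most correlated atom
        sub = A[:, idx]
        coef, *_ = np.linalg.lstsq(sub, y, rcond=None) # re-fit on support
        residual = y - sub @ coef
    x = np.zeros(A.shape[1])
    x[idx] = coef
    return x
```

The weakness this baseline has with correlated multipath atoms and noise is exactly what the oblique-projection and dual-space refinements target.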

28 pages, 19672 KB  
Article
A Multi-Fidelity Data Fusion Approach Based on Semi-Supervised Learning for Image Super-Resolution in Data-Scarce Scenarios
by Hongzheng Zhu, Yingjuan Zhao, Ximing Qiao, Jinshuo Zhang, Jingnan Ma and Sheng Tong
Sensors 2025, 25(17), 5373; https://doi.org/10.3390/s25175373 - 31 Aug 2025
Abstract
Image super-resolution (SR) techniques can significantly enhance visual quality and information density. However, existing methods often rely on large amounts of paired low- and high-resolution (LR-HR) data, which limits their generalization and robustness when faced with data scarcity, distribution inconsistencies, and missing high-frequency details. To tackle the challenges of image reconstruction in data-scarce scenarios, this paper proposes a semi-supervised learning-driven multi-fidelity fusion (SSLMF) method, which integrates multi-fidelity data fusion (MFDF) and semi-supervised learning (SSL) to reduce reliance on high-fidelity data. More specifically, (1) an MFDF strategy is employed to leverage low-fidelity data for global structural constraints, enhancing information compensation; (2) an SSL mechanism is introduced to reduce data dependence by using only a small amount of labeled HR samples along with a large quantity of unlabeled multi-fidelity data. This framework significantly improves data efficiency and reconstruction quality. We first validate the reconstruction accuracy of SSLMF on benchmark functions and then apply it to image reconstruction tasks. The results demonstrate that SSLMF can effectively model both linear and nonlinear relationships among multi-fidelity data, maintaining high performance even with limited high-fidelity samples. Finally, its cross-disciplinary potential is illustrated through an audio restoration case study, offering a novel solution for efficient image reconstruction, especially in data-scarce scenarios where high-fidelity samples are limited. Full article
(This article belongs to the Section Sensing and Imaging)
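The simplest multi-fidelity fusion model, fitting a linear map from low-fidelity to high-fidelity observations on a few paired samples, gives a feel for how low-fidelity data can constrain reconstruction. This linear surrogate is a textbook baseline, not the SSLMF model itself:

```python
import numpy as np

def fit_mf_correction(lf, hf):
    """Fit hf ≈ rho * lf + delta by least squares from paired samples;
    rho scales the low-fidelity signal, delta corrects its bias."""
    A = np.stack([lf, np.ones_like(lf)], axis=1)
    (rho, delta), *_ = np.linalg.lstsq(A, hf, rcond=None)
    return float(rho), float(delta)
```

Once (rho, delta) are known, abundant unlabeled low-fidelity data can be mapped toward the high-fidelity scale, which is the data-efficiency lever the semi-supervised framework exploits more expressively.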

26 pages, 5964 KB  
Article
Super-Resolution Reconstruction of Part Images Using Adaptive Multi-Scale Object Tracking
by Yaohe Li, Long Jin, Yindi Bai, Zhiwen Song and Dongyuan Ge
Processes 2025, 13(8), 2563; https://doi.org/10.3390/pr13082563 - 14 Aug 2025
Abstract
Computer vision-based part surface inspection is widely used for quality evaluation. However, challenges such as low image quality, caused by factors like inadequate acquisition equipment, camera vibrations, and environmental conditions, often lead to reduced detection accuracy. Although super-resolution reconstruction can enhance image quality, existing methods face issues such as limited accuracy, information distortion, and high computational cost. To overcome these challenges, we propose a novel super-resolution reconstruction method for part images that incorporates adaptive multi-scale object tracking. Our approach first adaptively segments the input sequence of part images into blocks of varying scales, improving both reconstruction accuracy and computational efficiency. Optical flow is then applied to estimate the motion parameters between sequence images, followed by the construction of a feature tracking and sampling model to extract detailed features from all images, addressing information distortion caused by pixel misalignment. Finally, a non-linear reconstruction algorithm is employed to generate the high-resolution target image. Experimental results demonstrate that our method achieves superior performance in terms of both quantitative metrics and visual quality, outperforming existing methods. This contributes to a significant improvement in subsequent part detection accuracy and production efficiency. Full article
(This article belongs to the Section Manufacturing Processes and Systems)
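The core of sequence-based reconstruction, accumulating motion-aligned low-resolution samples onto a high-resolution grid, can be sketched with a classic shift-and-add baseline. The paper's adaptive blocking, optical-flow tracking, and non-linear reconstruction are not reproduced; integer sub-pixel shifts are assumed known here:

```python
import numpy as np

def shift_and_add(frames, shifts, scale):
    """Place each LR frame's pixels on the HR grid at its (dy, dx) offset
    (in HR pixels) and average wherever samples overlap."""
    h, w = frames[0].shape
    acc = np.zeros((h * scale, w * scale))
    cnt = np.zeros_like(acc)
    for f, (dy, dx) in zip(frames, shifts):
        acc[dy::scale, dx::scale] += f
        cnt[dy::scale, dx::scale] += 1
    cnt[cnt == 0] = 1                      # leave unsampled cells at zero
    return acc / cnt
```

With a complete set of offsets the HR image is recovered exactly; with real, imperfect flow estimates, the misalignment artifacts this baseline produces motivate the feature tracking and sampling model in the abstract.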

27 pages, 8755 KB  
Article
Mapping Wetlands with High-Resolution Planet SuperDove Satellite Imagery: An Assessment of Machine Learning Models Across the Diverse Waterscapes of New Zealand
by Md. Saiful Islam Khan, Maria C. Vega-Corredor and Matthew D. Wilson
Remote Sens. 2025, 17(15), 2626; https://doi.org/10.3390/rs17152626 - 29 Jul 2025
Cited by 1
Abstract
(1) Background: Wetlands are ecologically significant ecosystems that support biodiversity and contribute to essential environmental functions such as water purification, carbon storage and flood regulation. However, these ecosystems face increasing pressures from land-use change and degradation, prompting the need for scalable and accurate classification methods to support conservation and policy efforts. In this research, our motivation was to test whether high-spatial-resolution PlanetScope imagery can be used with pixel-based machine learning to support the mapping and monitoring of wetlands at a national scale. (2) Methods: This study compared four machine learning classification models—Random Forest (RF), XGBoost (XGB), Histogram-Based Gradient Boosting (HGB) and a Multi-Layer Perceptron Classifier (MLPC)—to detect and map wetland areas across New Zealand. All models were trained using eight-band SuperDove satellite imagery from PlanetScope, with a spatial resolution of ~3 m, and ancillary geospatial datasets representing topography and soil drainage characteristics, each of which is available globally. (3) Results: All four machine learning models performed well in detecting wetlands from SuperDove imagery and environmental covariates, with varying strengths. The highest accuracy was achieved using all eight image bands alongside features created from supporting geospatial data. For binary wetland classification, the highest F1 scores were recorded by XGB (0.73) and RF/HGB (both 0.72) when including all covariates. MLPC also showed competitive performance (wetland F1 score of 0.71), despite its relatively lower spatial consistency. However, each model over-predicts total wetland area at a national level, an issue that could be reduced by increasing the classification probability threshold and spatial filtering. (4) Conclusions: The comparative analysis highlights the strengths and trade-offs of RF, XGB, HGB and MLPC models for wetland classification. While all four methods are viable, RF offers some key advantages, including ease of deployment and transferability, positioning it as a promising candidate for scalable, high-resolution wetland monitoring across diverse ecological settings. Further work is required for verification of small-scale wetlands (<~0.5 ha) and the addition of fine-spatial-scale covariates. Full article
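The probability-threshold remedy for over-predicted wetland area can be sketched directly: raising the threshold above 0.5 shrinks the predicted area at the cost of recall. The ~3 m pixel size comes from the abstract; the function and example grid are illustrative:

```python
import numpy as np

def wetland_area_ha(proba, threshold=0.5, pixel_m=3.0):
    """Predicted wetland area in hectares from a per-pixel P(wetland) map,
    counting pixels at or above the classification threshold."""
    n = int(np.count_nonzero(proba >= threshold))
    return n * pixel_m * pixel_m / 10_000.0   # m^2 per pixel -> ha
```

Sweeping the threshold and comparing the resulting national totals against reference wetland extents is one way to pick an operating point that curbs over-prediction.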

28 pages, 3794 KB  
Article
A Robust System for Super-Resolution Imaging in Remote Sensing via Attention-Based Residual Learning
by Rogelio Reyes-Reyes, Yeredith G. Mora-Martinez, Beatriz P. Garcia-Salgado, Volodymyr Ponomaryov, Jose A. Almaraz-Damian, Clara Cruz-Ramos and Sergiy Sadovnychiy
Mathematics 2025, 13(15), 2400; https://doi.org/10.3390/math13152400 - 25 Jul 2025
Abstract
Deep learning-based super-resolution (SR) frameworks are widely used in remote sensing applications. However, existing SR models still face limitations, particularly in recovering contours, fine features, and textures, as well as in effectively integrating channel information. To address these challenges, this study introduces a novel residual model named OARN (Optimized Attention Residual Network) specifically designed to enhance the visual quality of low-resolution images. The network operates on the Y channel of the YCbCr color space and integrates LKA (Large Kernel Attention) and OCM (Optimized Convolutional Module) blocks. These components can restore large-scale spatial relationships and refine textures and contours, improving feature reconstruction without significantly increasing computational complexity. The performance of OARN was evaluated using satellite images from WorldView-2, GaoFen-2, and Microsoft Virtual Earth. Evaluation was conducted using objective quality metrics, such as Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index Measure (SSIM), Edge Preservation Index (EPI), and Perceptual Image Patch Similarity (LPIPS), demonstrating superior results compared to state-of-the-art methods in both objective measurements and subjective visual perception. Moreover, OARN achieves this performance while maintaining computational efficiency, offering a balanced trade-off between processing time and reconstruction quality. Full article
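Operating on the Y channel of YCbCr, as OARN does, starts from a standard luma conversion; a minimal sketch using the BT.601 coefficients (the conversion is standard, but treating it as OARN's exact preprocessing is an assumption):

```python
import numpy as np

def rgb_to_y(rgb):
    """BT.601 luma from an RGB image scaled to [0, 1]; SR runs on this
    channel, while Cb/Cr are typically upscaled separately (e.g. bicubic)
    and recombined afterwards."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return 0.299 * r + 0.587 * g + 0.114 * b
```

Working on Y alone cuts the network's input to one channel while keeping the structure and texture information that quality metrics like PSNR and SSIM are most sensitive to.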

21 pages, 4388 KB  
Article
An Omni-Dimensional Dynamic Convolutional Network for Single-Image Super-Resolution Tasks
by Xi Chen, Ziang Wu, Weiping Zhang, Tingting Bi and Chunwei Tian
Mathematics 2025, 13(15), 2388; https://doi.org/10.3390/math13152388 - 25 Jul 2025
Cited by 2
Abstract
The goal of single-image super-resolution (SISR) tasks is to generate high-definition images from low-quality inputs, with practical uses spanning healthcare diagnostics, aerial imaging, and surveillance systems. Although CNNs have considerably improved image reconstruction quality, existing methods still face limitations, including inadequate restoration of high-frequency details, high computational complexity, and insufficient adaptability to complex scenes. To address these challenges, we propose an Omni-dimensional Dynamic Convolutional Network (ODConvNet) tailored for SISR tasks. Specifically, ODConvNet comprises four key components: a Feature Extraction Block (FEB) that captures low-level spatial features; an Omni-dimensional Dynamic Convolution Block (DCB), which utilizes a multidimensional attention mechanism to dynamically reweight convolution kernels across spatial, channel, and kernel dimensions, thereby enhancing feature expressiveness and context modeling; a Deep Feature Extraction Block (DFEB) that stacks multiple convolutional layers with residual connections to progressively extract and fuse high-level features; and a Reconstruction Block (RB) that employs subpixel convolution to upscale features and refine the final HR output. This mechanism significantly enhances feature extraction and effectively captures rich contextual information. Additionally, we employ an improved residual network structure combined with a refined Charbonnier loss function to alleviate gradient vanishing and exploding to enhance the robustness of model training. Extensive experiments conducted on widely used benchmark datasets, including DIV2K, Set5, Set14, B100, and Urban100, demonstrate that, compared with existing deep learning-based SR methods, our ODConvNet method improves Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index (SSIM), and the visual quality of SR images is also improved. Ablation studies further validate the effectiveness and contribution of each component in our network. The proposed ODConvNet offers an effective, flexible, and efficient solution for the SISR task and provides promising directions for future research. Full article
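The Charbonnier loss mentioned above has a standard form worth spelling out: a smooth approximation of L1 that stays differentiable at zero and is less outlier-dominated than L2. The `eps` value is an assumed typical choice, not necessarily the paper's refined variant:

```python
import numpy as np

def charbonnier(pred, target, eps=1e-3):
    """Charbonnier loss: mean of sqrt(diff^2 + eps^2). Behaves like L1 for
    large errors and like a scaled L2 near zero, easing gradient issues."""
    d = pred - target
    return float(np.mean(np.sqrt(d * d + eps * eps)))
```

For large residuals the loss approaches |diff|, while near zero its gradient is bounded, which is why it is a common substitute for L2 in SR training.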

23 pages, 4237 KB  
Article
Debris-Flow Erosion Volume Estimation Using a Single High-Resolution Optical Satellite Image
by Peng Zhang, Shang Wang, Guangyao Zhou, Yueze Zheng, Kexin Li and Luyan Ji
Remote Sens. 2025, 17(14), 2413; https://doi.org/10.3390/rs17142413 - 12 Jul 2025
Cited by 1
Abstract
Debris flows pose significant risks to mountainous regions, and quick, accurate volume estimation is crucial for hazard assessment and post-disaster response. Traditional volume estimation methods, such as ground surveys and aerial photogrammetry, are often limited by cost, accessibility, and timeliness. While remote sensing offers wide coverage, existing optical and Synthetic Aperture Radar (SAR)-based techniques face challenges in direct volume estimation due to resolution constraints and rapid terrain changes. This study proposes a Super-Resolution Shape from Shading (SRSFS) approach enhanced by a Non-local Piecewise-smooth albedo Constraint (NPC), hereafter referred to as NPC SRSFS, to estimate debris-flow erosion volume using single high-resolution optical satellite imagery. By integrating publicly available global Digital Elevation Model (DEM) data as prior terrain reference, the method enables accurate post-disaster topography reconstruction from a single optical image, thereby reducing reliance on stereo imagery. The NPC constraint improves the robustness of albedo estimation under heterogeneous surface conditions, enhancing depth recovery accuracy. The methodology is evaluated using Gaofen-6 satellite imagery, with quantitative comparisons to aerial Light Detection and Ranging (LiDAR) data. Results show that the proposed method achieves reliable terrain reconstruction and erosion volume estimates, with accuracy comparable to airborne LiDAR. This study demonstrates the potential of NPC SRSFS as a rapid, cost-effective alternative for post-disaster debris-flow assessment. Full article
(This article belongs to the Section Remote Sensing in Geology, Geomorphology and Hydrology)
17 pages, 7786 KB  
Article
Video Coding Based on Ladder Subband Recovery and ResGroup Module
by Libo Wei, Aolin Zhang, Lei Liu, Jun Wang and Shuai Wang
Entropy 2025, 27(7), 734; https://doi.org/10.3390/e27070734 - 8 Jul 2025
Viewed by 670
Abstract
With the rapid development of video encoding technology in the field of computer vision, the demand for tasks such as video frame reconstruction, denoising, and super-resolution has been continuously increasing. However, traditional video encoding methods typically focus on extracting spatial or temporal domain information, often facing challenges of insufficient accuracy and information loss when reconstructing high-frequency details, edges, and textures of images. To address this issue, this paper proposes an innovative LadderConv framework, which combines discrete wavelet transform (DWT) with spatial and channel attention mechanisms. By progressively recovering wavelet subbands, it effectively enhances the video frame encoding quality. Specifically, the LadderConv framework adopts a stepwise recovery approach for wavelet subbands, first processing high-frequency detail subbands with relatively less information, then enhancing the interaction between these subbands, and ultimately synthesizing a high-quality reconstructed image through inverse wavelet transform. Moreover, the framework introduces spatial and channel attention mechanisms, which further strengthen the focus on key regions and channel features, leading to notable improvements in detail restoration and image reconstruction accuracy. To optimize the performance of the LadderConv framework, particularly in detail recovery and high-frequency information extraction tasks, this paper designs an innovative ResGroup module. By using multi-layer convolution operations along with feature map compression and recovery, the ResGroup module enhances the network's expressive capability and effectively reduces computational complexity. The ResGroup module captures multi-level features from low level to high level and retains rich feature information through residual connections, thus improving the overall reconstruction performance of the model. In experiments, the combination of the LadderConv framework and the ResGroup module demonstrates superior performance in video frame reconstruction tasks, particularly in recovering high-frequency information, image clarity, and detail representation. Full article
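The subband decomposition and inverse-transform reconstruction step that the abstract describes can be illustrated generically with a one-level Haar wavelet round trip. This is a minimal 1-D sketch for intuition only: the paper's LadderConv framework operates on 2-D video frames and adds learned attention between subbands, none of which is reproduced here.

```python
import math

def haar_dwt(signal):
    """Split an even-length signal into an approximation (low-frequency)
    subband and a detail (high-frequency) subband."""
    approx = [(signal[i] + signal[i + 1]) / math.sqrt(2)
              for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / math.sqrt(2)
              for i in range(0, len(signal), 2)]
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse transform: recombine the two subbands into the original signal."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / math.sqrt(2))
        out.append((a - d) / math.sqrt(2))
    return out

x = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
lo, hi = haar_dwt(x)        # low-pass and high-pass subbands
rec = haar_idwt(lo, hi)     # perfect reconstruction (up to float error)
```

In a subband-recovery scheme of the kind the abstract outlines, the detail coefficients (`hi`) would be processed or enhanced before the inverse transform, while the exact reconstruction property shown here guarantees that untouched subbands pass through losslessly.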
(This article belongs to the Special Issue Rethinking Representation Learning in the Age of Large Models)
23 pages, 14051 KB  
Article
A Novel Method for Water Surface Debris Detection Based on YOLOV8 with Polarization Interference Suppression
by Yi Chen, Honghui Lin, Lin Xiao, Maolin Zhang and Pingjun Zhang
Photonics 2025, 12(6), 620; https://doi.org/10.3390/photonics12060620 - 18 Jun 2025
Cited by 1 | Viewed by 1366
Abstract
Aquatic floating debris detection is a key technological foundation for ecological monitoring and integrated water environment management. It holds substantial scientific and practical value in applications such as pollution source tracing, floating debris control, and maritime navigation safety. However, this field faces ongoing challenges due to water surface polarization. Reflections of polarized light produce intense glare, resulting in localized overexposure, detail loss, and geometric distortion in captured images. These optical artifacts severely impair the performance of conventional detection algorithms, increasing both false positives and missed detections. To overcome these imaging challenges in complex aquatic environments, we propose a novel YOLOv8-based detection framework with integrated polarized light suppression mechanisms. The framework consists of four key components: a fisheye distortion correction module, a polarization feature processing layer, a customized residual network with Squeeze-and-Excitation (SE) attention, and a cascaded pipeline for super-resolution reconstruction and deblurring. Additionally, we developed the PSF-IMG dataset (Polarized Surface Floats), which includes common floating debris types such as plastic bottles, bags, and foam boards. Extensive experiments demonstrate the network’s robustness in suppressing polarization artifacts and enhancing feature stability under dynamic optical conditions. Full article
(This article belongs to the Special Issue Advancements in Optical Measurement Techniques and Applications)