Search Results (58)

Search Parameters:
Keywords = laplacian pyramid

22 pages, 34603 KiB  
Article
A Real-Time High-Resolution Multi-Focus Image Fusion Algorithm Based on Multi-Scale Feature Aggregation
by Huawei Chen, Xingkai Du, Hongchuan Huang and Tingyu Zhao
Appl. Sci. 2025, 15(13), 6967; https://doi.org/10.3390/app15136967 - 20 Jun 2025
Viewed by 287
Abstract
In microscopic imaging, the key to obtaining a fully clear image lies in effectively extracting and fusing the sharp regions from different focal planes. However, traditional multi-focus image fusion algorithms have high computational complexity, making it difficult to achieve real-time processing on embedded devices. We propose an efficient high-resolution real-time multi-focus image fusion algorithm based on multi-scale feature aggregation, which uses a difference of Gaussians image and a Laplacian pyramid for focused region detection. Additionally, the image is down-sampled before the focused region detection, and up-sampling is applied at the output end of the decision map, thereby reducing the computational data volume by 75%. The experimental results show that the proposed algorithm excels in both focused region extraction and computational efficiency. It achieves image fusion quality comparable to other algorithms while significantly improving processing efficiency. The average time for multi-focus image fusion of a 4K-resolution image on embedded devices is 0.586 s. Compared with traditional algorithms, the proposed method achieves a 94.09% efficiency improvement on embedded devices and a 21.17% efficiency gain on desktop computing platforms. Full article
(This article belongs to the Section Computing and Artificial Intelligence)
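
As an illustration of the downsample-then-detect idea described in this abstract (not the authors' implementation), here is a minimal Python/OpenCV sketch that computes a difference-of-Gaussians focus measure on downsampled copies of two grayscale source images, upsamples the resulting decision map, and blends. The Laplacian-pyramid stage is omitted and all parameter values are placeholders.

```python
import cv2
import numpy as np

def fuse_multifocus(img_a, img_b, sigma1=1.0, sigma2=2.0, scale=2):
    """Fuse two grayscale multi-focus images via a DoG focus measure.

    The focus measure is computed on downsampled copies and the binary
    decision map is upsampled back, which cuts the per-pixel work roughly
    by scale**2 (75% for scale=2).
    """
    def focus_measure(img):
        small = cv2.resize(img, None, fx=1.0 / scale, fy=1.0 / scale,
                           interpolation=cv2.INTER_AREA)
        g1 = cv2.GaussianBlur(small, (0, 0), sigma1)
        g2 = cv2.GaussianBlur(small, (0, 0), sigma2)
        dog = np.abs(g1.astype(np.float32) - g2.astype(np.float32))
        # Aggregate local sharpness over a small window.
        return cv2.boxFilter(dog, -1, (9, 9))

    fa, fb = focus_measure(img_a), focus_measure(img_b)
    decision_small = (fa > fb).astype(np.float32)
    decision = cv2.resize(decision_small, (img_a.shape[1], img_a.shape[0]),
                          interpolation=cv2.INTER_LINEAR)
    decision = cv2.GaussianBlur(decision, (0, 0), 2.0)  # soften seams
    return (decision * img_a.astype(np.float32)
            + (1.0 - decision) * img_b.astype(np.float32)).astype(np.uint8)
```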

11 pages, 779 KiB  
Proceeding Paper
A Novel Approach for Classifying Gliomas from Magnetic Resonance Images Using Image Decomposition and Texture Analysis
by Kunda Suresh Babu, Benjmin Jashva Munigeti, Krishna Santosh Naidana and Sesikala Bapatla
Eng. Proc. 2025, 87(1), 70; https://doi.org/10.3390/engproc2025087070 - 30 May 2025
Viewed by 317
Abstract
Accurate glioma categorization using magnetic resonance (MR) imaging is critical for optimal treatment planning. However, the uneven and diffuse nature of glioma borders makes manual classification difficult and time-consuming. To address these limitations, we provide a unique strategy that combines image decomposition and local texture feature extraction to improve classification precision. The procedure starts with a Gaussian filter (GF) to smooth and reduce noise in MR images, followed by non-subsampled Laplacian pyramid (NSLP) decomposition to capture multi-scale image information and make glioma borders more visible, TV-L1 normalization to handle intensity discrepancies, and local binary patterns (LBPs) to extract significant texture features from the processed images. These features are then fed into a range of supervised machine learning classifiers, such as support vector machines (SVMs), K-nearest neighbors (KNNs), decision trees (DTs), AdaBoost, and LogitBoost, which are trained to distinguish between low-grade (LG) and high-grade (HG) gliomas. According to experimental findings, our proposed approach consistently performs better than state-of-the-art glioma classification techniques, with a higher degree of accuracy in differentiating LG and HG gliomas. This method has the potential to significantly increase diagnostic precision, enabling doctors to make better-informed and more efficient treatment choices. Full article
(This article belongs to the Proceedings of The 5th International Electronic Conference on Applied Sciences)
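
A minimal sketch of the texture-feature and classification stage described above, assuming already-preprocessed 2D MR slices; it uses uniform LBP histograms and an SVM and omits the GF, NSLP, and TV-L1 steps. The function names and parameter values are illustrative, not from the paper.

```python
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def lbp_histogram(image, radius=2, n_points=16):
    """Uniform LBP histogram as a texture descriptor for one MR slice."""
    lbp = local_binary_pattern(image, n_points, radius, method="uniform")
    n_bins = n_points + 2  # uniform patterns + one "non-uniform" bin
    hist, _ = np.histogram(lbp, bins=n_bins, range=(0, n_bins), density=True)
    return hist

def train_glioma_classifier(images, labels):
    """images: list of 2D arrays (preprocessed slices); labels: 0=LG, 1=HG."""
    X = np.array([lbp_histogram(im) for im in images])
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
    clf.fit(X, labels)
    return clf
```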

17 pages, 2144 KiB  
Article
DEPANet: A Differentiable Edge-Guided Pyramid Aggregation Network for Strip Steel Surface Defect Segmentation
by Yange Sun, Siyu Geng, Chengyi Zheng, Chenglong Xu, Huaping Guo and Yan Feng
Algorithms 2025, 18(5), 279; https://doi.org/10.3390/a18050279 - 9 May 2025
Viewed by 431
Abstract
The steel strip is an important and ideal material for the automotive and aerospace industries due to its superior machinability, cost efficiency, and flexibility. However, surface defects such as inclusions, spots, and scratches can significantly impact product performance and durability. Accurately identifying these defects remains challenging due to the complex texture structures and subtle variations in the material. To tackle this challenge, we propose a Differentiable Edge-guided Pyramid Aggregation Network (DEPANet) that exploits edge information to improve segmentation performance. DEPANet adopts an end-to-end encoder-decoder framework, where the encoder consists of three key components: a backbone network, a Differentiable Edge Feature Pyramid network (DEFP), and Edge-aware Feature Aggregation Modules (EFAMs). The backbone network is designed to extract overall features from the strip steel surface, while the proposed DEFP uses learnable Laplacian operators to extract multiscale edge information of defects across scales. In addition, the proposed EFAMs aggregate the overall features generated by the backbone and the edge information obtained from DEFP using the Convolutional Block Attention Module (CBAM), which combines channel and spatial attention mechanisms to enhance feature expression. Finally, through the decoder, implemented as a Feature Pyramid Network (FPN), the multiscale edge-enhanced features are progressively upsampled and fused to reconstruct high-resolution segmentation maps, enabling precise defect localization and robust handling of defects of various sizes and shapes. DEPANet demonstrates superior segmentation accuracy, edge preservation, and feature representation on the SD-saliency-900 dataset, outperforming other state-of-the-art methods and delivering more precise and reliable defect segmentation. Full article
(This article belongs to the Special Issue Machine Learning Algorithms for Image Understanding and Analysis)
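
The "learnable Laplacian operator" idea can be sketched as a depthwise convolution whose kernel is initialized to a Laplacian and then refined during training. This is only an illustration of the concept, not the DEFP module itself, and the channel sizes below are placeholders.

```python
import torch
import torch.nn as nn

class LearnableLaplacianEdge(nn.Module):
    """Depthwise convolution initialized with a 3x3 Laplacian kernel.

    Because the kernel is a trainable parameter, the edge extractor can
    adapt during training instead of remaining a fixed operator.
    """
    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=3,
                              padding=1, groups=channels, bias=False)
        lap = torch.tensor([[0., 1., 0.],
                            [1., -4., 1.],
                            [0., 1., 0.]])
        with torch.no_grad():
            self.conv.weight.copy_(lap.expand(channels, 1, 3, 3))

    def forward(self, x):
        return self.conv(x)

# Example: one edge extractor per backbone feature scale (placeholder widths).
edge_ops = nn.ModuleList([LearnableLaplacianEdge(c) for c in (64, 128, 256)])
```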

17 pages, 3914 KiB  
Article
Multi-Scale Fusion Underwater Image Enhancement Based on HSV Color Space Equalization
by Jialiang Zhang, Haibing Su, Tao Zhang, Hu Tian and Bin Fan
Sensors 2025, 25(9), 2850; https://doi.org/10.3390/s25092850 - 30 Apr 2025
Viewed by 510
Abstract
Meeting the escalating demand for high-quality underwater imagery poses a significant challenge due to light absorption and scattering in water, resulting in color distortion and reduced contrast. This study presents an innovative approach for enhancing underwater images, combining color correction, HSV color space equalization, and multi-scale fusion techniques. Initially, automatic contrast adjustment and improved white balance corrected color bias; this was followed by saturation and value equalization in the HSV space to enhance brightness and saturation. Gaussian and Laplacian pyramid methods extracted multi-scale features that were fused to augment image details and edges. Extensive subjective and objective evaluations compared our method with existing algorithms, demonstrating its superior performance in UCIQE (0.64368) and information entropy (7.8041) metrics. The proposed method effectively improves overall image quality, mitigates color bias, and enhances brightness and saturation. Full article
(This article belongs to the Section Sensing and Imaging)
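
A rough sketch of two of the stages described above: saturation/value equalization in HSV space and weight-guided Gaussian/Laplacian pyramid fusion. It assumes 8-bit BGR inputs and a 2D weight map in [0, 1]; the color-correction step and the paper's actual fusion weights are not reproduced.

```python
import cv2
import numpy as np

def equalize_hsv(bgr):
    """Equalize the S and V channels in HSV space, as a rough stand-in
    for the saturation/value equalization step described above."""
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
    h, s, v = cv2.split(hsv)
    s, v = cv2.equalizeHist(s), cv2.equalizeHist(v)
    return cv2.cvtColor(cv2.merge([h, s, v]), cv2.COLOR_HSV2BGR)

def pyramid_fuse(img1, img2, w1, levels=4):
    """Blend two 8-bit BGR images with a per-pixel weight map w1 (H x W,
    values in [0, 1]) using Gaussian pyramids for the weights and
    Laplacian pyramids for the images, exposure-fusion style."""
    def gauss_pyr(x):
        pyr = [x]
        for _ in range(levels):
            pyr.append(cv2.pyrDown(pyr[-1]))
        return pyr

    def lap_pyr(x):
        g = gauss_pyr(x)
        return [g[i] - cv2.pyrUp(g[i + 1], dstsize=g[i].shape[1::-1])
                for i in range(levels)] + [g[-1]]

    lp1 = lap_pyr(img1.astype(np.float32))
    lp2 = lap_pyr(img2.astype(np.float32))
    gw = gauss_pyr(w1.astype(np.float32))
    fused = [gw[i][..., None] * lp1[i] + (1 - gw[i][..., None]) * lp2[i]
             for i in range(levels + 1)]
    out = fused[-1]
    for i in range(levels - 1, -1, -1):
        out = cv2.pyrUp(out, dstsize=fused[i].shape[1::-1]) + fused[i]
    return np.clip(out, 0, 255).astype(np.uint8)
```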

14 pages, 6013 KiB  
Article
FE-P Net: An Image-Enhanced Parallel Density Estimation Network for Meat Duck Counting
by Huanhuan Qin, Wensheng Teng, Mingzhou Lu, Xinwen Chen, Ye Erlan Xieermaola, Saydigul Samat and Tiantian Wang
Appl. Sci. 2025, 15(7), 3840; https://doi.org/10.3390/app15073840 - 1 Apr 2025
Viewed by 442
Abstract
Traditional object detection methods for meat duck counting suffer from high manual costs, low image quality, and varying object sizes. To address these issues, this paper proposes FE-P Net, an image enhancement-based parallel density estimation network that integrates CNNs with Transformer models. FE-P Net employs a Laplacian pyramid to extract multi-scale features, effectively reducing the impact of low-resolution images on detection accuracy. Its parallel architecture combines convolutional operations with attention mechanisms, enabling the model to capture both global semantics and local details, thus enhancing its adaptability across diverse density scenarios. The Reconstructed Convolution Module is a crucial component that helps distinguish targets from backgrounds, significantly improving feature extraction accuracy. Validated on a meat duck counting dataset in breeding environments, FE-P Net achieved 96.46% accuracy in large-scale settings, demonstrating state-of-the-art performance. The model shows robustness across various densities, providing valuable insights for poultry counting methods in agricultural contexts. Full article
(This article belongs to the Special Issue Deep Learning and Digital Image Processing)

12 pages, 1300 KiB  
Article
Improving Image Quality of Chest Radiography with Artificial Intelligence-Supported Dual-Energy X-Ray Imaging System: An Observer Preference Study in Healthy Volunteers
by Sung-Hyun Yoon, Jihang Kim, Junghoon Kim, Jong-Hyuk Lee, Ilwoong Choi, Choul-Woo Shin and Chang-Min Park
J. Clin. Med. 2025, 14(6), 2091; https://doi.org/10.3390/jcm14062091 - 19 Mar 2025
Viewed by 1985
Abstract
Background/Objectives: To compare the image quality of chest radiography with a dual-energy X-ray imaging system using AI technology (DE-AI) to that of conventional chest radiography with a standard protocol. Methods: In this prospective study, 52 healthy volunteers underwent dual-energy chest radiography. Images were obtained using two exposures at 60 kVp and 120 kVp, separated by a 150 ms interval. Four images were generated for each participant: a conventional image, an enhanced standard image, a soft-tissue-selective image, and a bone-selective image. A machine learning model optimized the cancellation parameters for generating soft-tissue and bone-selective images. To enhance image quality, motion artifacts were minimized using Laplacian pyramid diffeomorphic registration, while a wavelet directional cycle-consistent adversarial network (WavCycleGAN) reduced image noise. Four radiologists independently evaluated the visibility of thirteen anatomical regions (eight soft-tissue regions and five bone regions) and the overall image with a five-point scale of preference. Pooled mean values were calculated for each anatomic region through meta-analysis using a random-effects model. Results: Radiologists preferred DE-AI images to conventional chest radiographs in various anatomic regions. The enhanced standard image showed superior quality in 9 of 13 anatomic regions. Preference for the soft-tissue-selective image was statistically significant for three of eight anatomic regions. Preference for the bone-selective image was statistically significant for four of five anatomic regions. Conclusions: Images produced by DE-AI provide better visualization of thoracic structures. Full article
(This article belongs to the Special Issue New Insights into Lung Imaging)

28 pages, 6900 KiB  
Article
A New Approach to Recognize Faces Amidst Challenges: Fusion Between the Opposite Frequencies of the Multi-Resolution Features
by Regina Lionnie, Julpri Andika and Mudrik Alaydrus
Algorithms 2024, 17(11), 529; https://doi.org/10.3390/a17110529 - 17 Nov 2024
Viewed by 1322
Abstract
This paper proposes a new approach to pixel-level fusion using the opposite frequency from the discrete wavelet transform with Gaussian or Difference of Gaussian. The low-frequency sub-band from the discrete wavelet transform was fused with the Difference of Gaussian, while the high-frequency sub-bands were fused with Gaussian. The final fusion was reconstructed using an inverse discrete wavelet transform into one enhanced reconstructed image. These enhanced images were utilized to improve recognition performance in the face recognition system. The proposed method was tested against benchmark face datasets such as The Database of Faces (AT&T), the Extended Yale B Face Dataset, the BeautyREC Face Dataset, and the FEI Face Dataset. The results showed that our proposed method was robust and accurate against challenges such as lighting conditions, facial expressions, head pose, 180-degree rotation of the face profile, dark images, acquisition with a time gap, and conditions where the person wears attributes such as glasses. The proposed method is comparable to state-of-the-art methods and achieves high recognition performance (more than 99% accuracy). Full article
(This article belongs to the Section Algorithms for Multidisciplinary Applications)
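
A single-level, illustrative rendering of the opposite-frequency fusion rule described above, using PyWavelets for the DWT on a grayscale image; the simple averaging rule, sigma values, and wavelet choice are stand-ins rather than the paper's exact settings.

```python
import cv2
import numpy as np
import pywt

def opposite_frequency_fuse(gray, sigma=1.0, k=1.6, wavelet="haar"):
    """Single-level sketch of the opposite-frequency idea: the DWT
    approximation (low-frequency) band is combined with a Difference-of-
    Gaussian image, the detail (high-frequency) bands with a Gaussian-
    smoothed image, and the result is rebuilt by the inverse DWT."""
    img = gray.astype(np.float32)
    gauss = cv2.GaussianBlur(img, (0, 0), sigma)
    dog = gauss - cv2.GaussianBlur(img, (0, 0), sigma * k)

    cA, (cH, cV, cD) = pywt.dwt2(img, wavelet)
    # Resize the opposite-frequency images to the sub-band resolution.
    size = (cA.shape[1], cA.shape[0])
    dog_s = cv2.resize(dog, size)
    gauss_s = cv2.resize(gauss, size)

    # Simple averaging stands in for the paper's fusion rule.
    cA = 0.5 * (cA + dog_s)
    cH, cV, cD = (0.5 * (c + gauss_s) for c in (cH, cV, cD))
    return pywt.idwt2((cA, (cH, cV, cD)), wavelet)
```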

18 pages, 9078 KiB  
Article
MMS-EF: A Multi-Scale Modular Extraction Framework for Enhancing Deep Learning Models in Remote Sensing
by Hang Yu, Weidong Song, Bing Zhang, Hongbo Zhu, Jiguang Dai and Jichao Zhang
Land 2024, 13(11), 1842; https://doi.org/10.3390/land13111842 - 5 Nov 2024
Viewed by 968
Abstract
The analysis of land cover using deep learning techniques plays a pivotal role in understanding land use dynamics, which is crucial for land management, urban planning, and cartography. However, due to the complexity of remote sensing images, deep learning models face practical challenges in the preprocessing stage, such as incomplete extraction of large-scale geographic features, loss of fine details, and misalignment issues in image stitching. To address these issues, this paper introduces the Multi-Scale Modular Extraction Framework (MMS-EF) specifically designed to enhance deep learning models in remote sensing applications. The framework incorporates three key components: (1) a multiscale overlapping segmentation module that captures comprehensive geographical information through multi-channel and multiscale processing, ensuring the integrity of large-scale features; (2) a multiscale feature fusion module that integrates local and global features, facilitating seamless image stitching and improving classification accuracy; and (3) a detail enhancement module that refines the extraction of small-scale features, enriching the semantic information of the imagery. Extensive experiments were conducted across various deep learning models, and the framework was validated on two public datasets. The results demonstrate that the proposed approach effectively mitigates the limitations of traditional preprocessing methods, significantly improving feature extraction accuracy and exhibiting strong adaptability across different datasets. Full article
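
The overlapping-segmentation idea can be illustrated with a simple tiling helper (tile and overlap sizes are assumed values); the paper's multiscale, multi-channel processing and its fusion and detail-enhancement modules are not shown.

```python
import numpy as np

def overlapping_tiles(image, tile=512, overlap=64):
    """Yield (row, col, patch) for overlapping tiles of a large remote
    sensing image; the overlapping borders are what a downstream fusion
    step can blend to avoid stitching seams."""
    step = tile - overlap
    h, w = image.shape[:2]
    for r in range(0, max(h - overlap, 1), step):
        for c in range(0, max(w - overlap, 1), step):
            yield r, c, image[r:r + tile, c:c + tile]
```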

20 pages, 11204 KiB  
Article
Estimating the Spectral Response of Eight-Band MSFA One-Shot Cameras Using Deep Learning
by Pierre Gouton, Kacoutchy Jean Ayikpa and Diarra Mamadou
Algorithms 2024, 17(11), 473; https://doi.org/10.3390/a17110473 - 22 Oct 2024
Viewed by 1268
Abstract
Eight-band one-shot MSFA (multispectral filter array) cameras are innovative technologies that capture multispectral images by acquiring multiple spectral bands simultaneously. They thus make it possible to collect detailed information on the spectral properties of the observed scenes economically. These cameras are widely used for object detection, material analysis, and agronomy. The evolution of one-shot MSFA cameras from 8 to 32 bands makes obtaining much more detailed spectral data possible, which is crucial for applications requiring delicate and precise analysis of the spectral properties of the observed scenes. Our study aims to develop models based on deep learning to estimate the spectral response of this type of camera and provide images close to the spectral properties of objects. First, we prepare our experiment data by projecting them to reflect the characteristics of our camera. Next, we harness the power of deep super-resolution neural networks, such as very deep super-resolution (VDSR), Laplacian pyramid super-resolution networks (LapSRN), and deeply recursive convolutional networks (DRCN), which we adapt to approximate the spectral response. These models learn the complex relationship between 8-band multispectral data from the camera and 31-band multispectral data from the multi-object database, enabling accurate and efficient conversion. Finally, we evaluate the images’ quality using metrics such as the loss value, PSNR, and SSIM. The model evaluation revealed that DRCN outperforms the others on the key performance metrics. DRCN achieved the lowest loss (0.0047) and stood out in image quality metrics, with a PSNR of 25.5059, SSIM of 0.8355, and SAM of 0.13215, indicating better preservation of details and textures. Additionally, DRCN showed the lowest RMSE (0.05849) and MAE (0.0415), confirming its ability to minimize reconstruction errors more effectively than VDSR and LapSRN. Full article
(This article belongs to the Special Issue Machine Learning for Pattern Recognition (2nd Edition))

25 pages, 27745 KiB  
Article
Infrared and Visible Image Fusion via Sparse Representation and Guided Filtering in Laplacian Pyramid Domain
by Liangliang Li, Yan Shi, Ming Lv, Zhenhong Jia, Minqin Liu, Xiaobin Zhao, Xueyu Zhang and Hongbing Ma
Remote Sens. 2024, 16(20), 3804; https://doi.org/10.3390/rs16203804 - 13 Oct 2024
Cited by 17 | Viewed by 2558
Abstract
Fusing infrared and visible images can fully leverage the respective advantages of each, providing a more comprehensive and richer set of information. This is applicable in various fields such as military surveillance, night navigation, environmental monitoring, etc. In this paper, a novel infrared and visible image fusion method based on sparse representation and guided filtering in the Laplacian pyramid (LP) domain is introduced. Each source image is decomposed into low- and high-frequency bands by the LP. Sparse representation has proven highly effective in image fusion and is used to process the low-frequency band, while guided filtering has excellent edge-preserving properties and can effectively maintain the spatial continuity of the high-frequency bands. Therefore, guided filtering combined with the weighted sum of eight-neighborhood-based modified Laplacian (WSEML) is used to process the high-frequency bands. Finally, the inverse LP transform is used to reconstruct the fused image. We conducted simulation experiments on the publicly available TNO dataset to validate the superiority of our proposed algorithm in fusing infrared and visible images. Our algorithm preserves both the thermal radiation characteristics of the infrared image and the detailed features of the visible image. Full article
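
A skeleton of LP-domain fusion with placeholder rules: an absolute-maximum rule stands in for the guided filtering/WSEML step on the high-frequency bands and a simple average for the sparse-representation step on the low band; grayscale 8-bit inputs are assumed.

```python
import cv2
import numpy as np

def laplacian_pyramid(img, levels=4):
    g = [img.astype(np.float32)]
    for _ in range(levels):
        g.append(cv2.pyrDown(g[-1]))
    lp = [g[i] - cv2.pyrUp(g[i + 1], dstsize=g[i].shape[1::-1])
          for i in range(levels)]
    return lp + [g[-1]]          # detail bands ..., coarse residual

def reconstruct(pyr):
    out = pyr[-1]
    for band in reversed(pyr[:-1]):
        out = cv2.pyrUp(out, dstsize=band.shape[1::-1]) + band
    return out

def fuse_ir_visible(ir, vis, levels=4):
    """Skeleton of LP-domain fusion of grayscale IR and visible images;
    simple rules replace the paper's sparse representation (low band)
    and guided filtering + WSEML (high bands)."""
    p_ir, p_vis = laplacian_pyramid(ir, levels), laplacian_pyramid(vis, levels)
    fused = []
    for i in range(levels):
        # High-frequency: keep the coefficient with larger absolute value.
        pick_ir = np.abs(p_ir[i]) >= np.abs(p_vis[i])
        fused.append(np.where(pick_ir, p_ir[i], p_vis[i]))
    fused.append(0.5 * (p_ir[-1] + p_vis[-1]))   # low band: simple average
    return np.clip(reconstruct(fused), 0, 255).astype(np.uint8)
```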

18 pages, 4253 KiB  
Article
RSTSRN: Recursive Swin Transformer Super-Resolution Network for Mars Images
by Fanlu Wu, Xiaonan Jiang, Tianjiao Fu, Yao Fu, Dongdong Xu and Chunlei Zhao
Appl. Sci. 2024, 14(20), 9286; https://doi.org/10.3390/app14209286 - 12 Oct 2024
Cited by 1 | Viewed by 1520
Abstract
High-resolution optical images will provide planetary geology researchers with finer, more detailed image data. In order to maximize scientific output, it is necessary to further increase the resolution of acquired images, so image super-resolution (SR) reconstruction techniques have become the best choice. To address the large parameter counts and high computational complexity of current deep learning-based image SR reconstruction methods, we propose a novel Recursive Swin Transformer Super-Resolution Network (RSTSRN) for image SR. The RSTSRN improves upon the LapSRN, which we use as our backbone architecture. A Residual Swin Transformer Block (RSTB), consisting of stacked Swin Transformer Blocks (STBs) with a residual connection, is used for more efficient residual learning. Moreover, the idea of parameter sharing was introduced to reduce the number of parameters, and a multi-scale training strategy was designed to accelerate convergence. Experimental results show that the proposed RSTSRN achieves performance superior to state-of-the-art methods with similar parameter counts on 2×, 4×, and 8× SR tasks. Especially on high-magnification SR tasks, the RSTSRN shows a clear performance advantage. Compared to the LapSRN network, for 2×, 4×, and 8× Mars image SR tasks, the RSTSRN increases PSNR by 0.35 dB, 0.88 dB, and 1.22 dB, and SSIM by 0.0048, 0.0114, and 0.0311, respectively. Full article
(This article belongs to the Special Issue Advances in Image Recognition and Processing Technologies)
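
The LapSRN backbone that the RSTSRN builds on follows a progressive 2x-per-stage pattern: each stage predicts a residual at the next scale and adds it to an upsampled copy of the current image. The sketch below uses plain convolutional blocks where the RSTSRN would use Residual Swin Transformer Blocks, assumes single-channel images, and is illustrative only.

```python
import torch
import torch.nn as nn

class ProgressiveSRStage(nn.Module):
    """One 2x stage: refine features, upsample, predict a residual image."""
    def __init__(self, channels=64, num_blocks=4):
        super().__init__()
        layers = []
        for _ in range(num_blocks):
            layers += [nn.Conv2d(channels, channels, 3, padding=1),
                       nn.LeakyReLU(0.2, inplace=True)]
        self.features = nn.Sequential(*layers)
        self.up_feat = nn.ConvTranspose2d(channels, channels, 4, stride=2, padding=1)
        self.to_residual = nn.Conv2d(channels, 1, 3, padding=1)
        self.up_img = nn.ConvTranspose2d(1, 1, 4, stride=2, padding=1)

    def forward(self, feat, img):
        feat = self.up_feat(self.features(feat))
        img = self.up_img(img) + self.to_residual(feat)
        return feat, img

class LapSRNLike(nn.Module):
    """Stacks 2x stages to produce 2x/4x/8x outputs; parameter sharing
    (as in the RSTSRN) would reuse a single stage object instead."""
    def __init__(self, num_stages=3):
        super().__init__()
        self.embed = nn.Conv2d(1, 64, 3, padding=1)
        self.stages = nn.ModuleList([ProgressiveSRStage() for _ in range(num_stages)])

    def forward(self, lr):
        feat, img, outputs = self.embed(lr), lr, []
        for stage in self.stages:
            feat, img = stage(feat, img)
            outputs.append(img)          # 2x, 4x, 8x predictions
        return outputs
```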

22 pages, 11714 KiB  
Article
A Light-Weight Self-Supervised Infrared Image Perception Enhancement Method
by Yifan Xiao, Zhilong Zhang and Zhouli Li
Electronics 2024, 13(18), 3695; https://doi.org/10.3390/electronics13183695 - 18 Sep 2024
Cited by 1 | Viewed by 1367
Abstract
Convolutional Neural Networks (CNNs) have achieved remarkable results in the field of infrared image enhancement. However, the research on the visual perception mechanism and the objective evaluation indicators for enhanced infrared images is still not in-depth enough. To make the subjective and objective evaluation more consistent, this paper uses a perceptual metric to evaluate the enhancement effect of infrared images. The perceptual metric mimics the early conversion process of the human visual system and uses the normalized Laplacian pyramid distance (NLPD) between the enhanced image and the original scene radiance to evaluate the image enhancement effect. Based on this, this paper designs an infrared image-enhancement algorithm that is more conducive to human visual perception. The algorithm uses a lightweight Fully Convolutional Network (FCN), with NLPD as the similarity measure, and trains the network in a self-supervised manner by minimizing the NLPD between the enhanced image and the original scene radiance to achieve infrared image enhancement. The experimental results show that the infrared image enhancement method in this paper outperforms existing methods in terms of visual perception quality, and due to the use of a lightweight network, it is also the fastest enhancement method currently. Full article
(This article belongs to the Topic Computer Vision and Image Processing, 2nd Edition)
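
A simplified stand-in for the NLPD similarity measure: Laplacian pyramid bands are divisively normalized by a local amplitude estimate and compared by RMS error. The constants and the box-filter normalization below are assumptions; the original metric uses learned filters, and the paper applies NLPD as a self-supervised training loss rather than as a plain NumPy function.

```python
import cv2
import numpy as np

def nlpd(x, y, levels=4, eps=0.17):
    """Simplified normalized Laplacian pyramid distance between two
    grayscale images in [0, 1]."""
    def norm_lap_bands(img):
        g = [img.astype(np.float32)]
        for _ in range(levels):
            g.append(cv2.pyrDown(g[-1]))
        bands = []
        for i in range(levels):
            band = g[i] - cv2.pyrUp(g[i + 1], dstsize=g[i].shape[1::-1])
            # Divisive normalization by a local amplitude estimate.
            denom = eps + cv2.boxFilter(np.abs(band), -1, (5, 5))
            bands.append(band / denom)
        return bands

    bx, by = norm_lap_bands(x), norm_lap_bands(y)
    # Root-mean-square error per band, averaged over bands.
    return float(np.mean([np.sqrt(np.mean((a - b) ** 2))
                          for a, b in zip(bx, by)]))
```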

19 pages, 6395 KiB  
Article
Dmg2Former-AR: Vision Transformers with Adaptive Rescaling for High-Resolution Structural Visual Inspection
by Kareem Eltouny, Seyedomid Sajedi and Xiao Liang
Sensors 2024, 24(18), 6007; https://doi.org/10.3390/s24186007 - 17 Sep 2024
Cited by 2 | Viewed by 1809
Abstract
Developments in drones and imaging hardware technology have opened up countless possibilities for enhancing structural condition assessments and visual inspections. However, processing the inspection images requires considerable work hours, leading to delays in the assessment process. This study presents a semantic segmentation architecture that integrates vision transformers with Laplacian pyramid scaling networks, enabling rapid and accurate pixel-level damage detection. Unlike conventional methods that often lose critical details through resampling or cropping high-resolution images, our approach preserves essential inspection-related information such as microcracks and edges using non-uniform image rescaling networks. This innovation allows for detailed damage identification of high-resolution images while significantly reducing the computational demands. Our main contributions in this study are: (1) proposing two rescaling networks that together allow for processing high-resolution images while significantly reducing the computational demands; and (2) proposing Dmg2Former, a low-resolution segmentation network with a Swin Transformer backbone that leverages the saved computational resources to produce detailed visual inspection masks. We validate our method through a series of experiments on publicly available visual inspection datasets, addressing various tasks such as crack detection and material identification. Finally, we examine the computational efficiency of the adaptive rescalers in terms of multiply–accumulate operations and GPU-memory requirements. Full article
(This article belongs to the Special Issue Feature Papers in Fault Diagnosis & Sensors 2024)

22 pages, 30798 KiB  
Article
Underwater Image Enhancement Fusion Method Guided by Salient Region Detection
by Jiawei Yang, Hongwu Huang, Fanchao Lin, Xiujing Gao, Junjie Jin and Biwen Zhang
J. Mar. Sci. Eng. 2024, 12(8), 1383; https://doi.org/10.3390/jmse12081383 - 13 Aug 2024
Cited by 5 | Viewed by 2827
Abstract
Exploring and monitoring underwater environments pose unique challenges due to water’s complex optical properties, which significantly impact image quality. Challenges like light absorption and scattering result in color distortion and decreased visibility. Traditional underwater image acquisition methods face these obstacles, highlighting the need for advanced enhancement techniques that address the color shift and loss of detail caused by the underwater environment. This study proposes a salient region-guided underwater image enhancement fusion method to alleviate these problems. First, this study proposes an advanced dark channel prior method to reduce haze effects in underwater images, significantly improving visibility and detail. Subsequently, a comprehensive RGB color correction restores the underwater scene’s natural appearance. The key innovation of our method is that fusion is performed through a combination of Laplacian and Gaussian pyramids, guided by salient region coefficients, thus preserving and accentuating the visually significant elements of the underwater environment. Comprehensive subjective and objective evaluations demonstrate our method’s superior performance in enhancing contrast, color depth, and overall visual quality compared to existing methods. Full article
(This article belongs to the Section Ocean Engineering)
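
The haze-removal stage can be illustrated with the classic dark channel prior baseline (He et al.); the paper's "advanced" variant and its saliency-guided Laplacian/Gaussian pyramid fusion are not reproduced here, and the parameters are typical defaults rather than the authors' values.

```python
import cv2
import numpy as np

def dehaze_dark_channel(bgr, patch=15, omega=0.95, t0=0.1):
    """Basic dark channel prior dehazing of an 8-bit BGR image, shown as
    a plain baseline for the haze-removal stage described above."""
    img = bgr.astype(np.float32) / 255.0
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (patch, patch))
    dark = cv2.erode(img.min(axis=2), kernel)

    # Atmospheric light: mean color of the brightest 0.1% dark-channel pixels.
    n = max(1, dark.size // 1000)
    idx = np.unravel_index(np.argsort(dark, axis=None)[-n:], dark.shape)
    A = img[idx].mean(axis=0)

    # Transmission estimate and scene radiance recovery.
    transmission = 1.0 - omega * cv2.erode((img / A).min(axis=2), kernel)
    t = np.maximum(transmission, t0)[..., None]
    recovered = (img - A) / t + A
    return np.clip(recovered * 255.0, 0, 255).astype(np.uint8)
```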

20 pages, 6060 KiB  
Article
Lightweight Frequency Recalibration Network for Diabetic Retinopathy Multi-Lesion Segmentation
by Yinghua Fu, Mangmang Liu, Ge Zhang and Jiansheng Peng
Appl. Sci. 2024, 14(16), 6941; https://doi.org/10.3390/app14166941 - 8 Aug 2024
Cited by 9 | Viewed by 1385
Abstract
Automated segmentation of diabetic retinopathy (DR) lesions is crucial for assessing DR severity and for diagnosis. Most previous segmentation methods overlook the detrimental impact of texture information bias, resulting in suboptimal segmentation results. Additionally, the role of lesion shape is not thoroughly considered. In this paper, we propose a lightweight frequency recalibration network (LFRC-Net) for simultaneous multi-lesion DR segmentation, which integrates a frequency recalibration module into the bottleneck layers of the encoder to analyze texture information and shape features together. The module utilizes a Gaussian pyramid to generate features at different scales, constructs a Laplacian pyramid using a difference of Gaussian filter, and then analyzes object features in different frequency domains with the Laplacian pyramid. The high-frequency component handles texture information, while the low-frequency area focuses on learning the shape features of DR lesions. By adaptively recalibrating these frequency representations, our method can differentiate the objects of interest. In the decoder, we introduce a residual attention module (RAM) to enhance lesion feature extraction and efficiently suppress irrelevant information. We evaluate the proposed model’s segmentation performance on two public datasets, IDRiD and DDR, and a private ultra-wide-field fundus image dataset. Extensive comparative experiments and ablation studies are conducted across multiple datasets. With minimal model parameters, our approach achieves an mAP_PR of 60.51%, 34.83%, and 14.35% for the segmentation of EX, HE, and MA on the DDR dataset and also obtains excellent results for EX and SE on the IDRiD dataset, which validates the effectiveness of our network. Full article