MDPI - Publisher of Open Access Journals

21 pages, 9995 KB

Open Access Article

HCNet: Multi-Exposure High-Dynamic-Range Reconstruction Network for Coded Aperture Snapshot Spectral Imaging

by Hang Shi, Jingxia Chen, Yahui Li, Pengwei Zhang and Jinshou Tian

Sensors 2026, 26(1), 337; https://doi.org/10.3390/s26010337 - 5 Jan 2026

Viewed by 1055

Coded Aperture Snapshot Spectral Imaging (CASSI) is a rapid hyperspectral imaging technique with broad application prospects. Due to limitations in three-dimensional compressed data acquisition modes and hardware constraints, the compressed measurements output by actual CASSI systems have a finite dynamic range, leading to degraded hyperspectral reconstruction quality. To address this issue, a high-quality hyperspectral reconstruction method based on multi-exposure fusion is proposed. A multi-exposure data acquisition strategy is established to capture low-, medium-, and high-exposure low-dynamic-range (LDR) measurements. A multi-exposure fusion-based high-dynamic-range (HDR) CASSI measurement reconstruction network (HCNet) is designed to reconstruct physically consistent HDR measurement images. Unlike traditional HDR networks for visual enhancement, HCNet employs a multiscale feature fusion architecture and combines local–global convolutional joint attention with residual enhancement mechanisms to efficiently fuse complementary information from multiple exposures. This makes it more suitable for CASSI systems, ensuring high-fidelity reconstruction of hyperspectral data in both spatial and spectral dimensions. A multi-exposure fusion CASSI mathematical model is constructed, and a CASSI experimental system is established. Simulation and real-world experimental results demonstrate that the proposed method significantly improves hyperspectral image reconstruction quality compared to traditional single-exposure strategies, exhibiting high robustness against multi-exposure interval jitters and shot noise in practical systems. 
Leveraging the higher-dynamic-range target information acquired through multiple exposures, especially in HDR scenes, the method enables reconstruction with enhanced contrast in both bright and dark details and also demonstrates higher spectral correlation, validating the enhancement of CASSI reconstruction and effective measurement capability in HDR scenarios. Full article

(This article belongs to the Section Optical Sensors)

32 pages, 29223 KB

Open Access Feature Paper Article

Variance-Driven U-Net Weighted Training and Chroma-Scale-Based Multi-Exposure Image Fusion

by Chang-Woo Son, Young-Ho Go, Seung-Hwan Lee and Sung-Hak Lee

Mathematics 2025, 13(22), 3629; https://doi.org/10.3390/math13223629 - 12 Nov 2025

Viewed by 839

Abstract

Multi-exposure image fusion (MEF) aims to generate a well-exposed image by combining multiple photographs captured at different exposure levels. However, deep learning-based approaches are often highly dependent on the quality of the training data, which can lead to inconsistent color reproduction and loss of fine details. To address this issue, this study proposes a variance-driven hybrid MEF framework based on a U-Net architecture, which adaptively balances structural and chromatic information. In the proposed method, the variance of randomly cropped patches is used as a training weight, allowing the model to emphasize structurally informative regions and thereby preserve local details during the fusion process. Furthermore, a fusion strategy based on the geometric color distance, referred to as the Chroma scale, in the LAB color space is applied to preserve the original chroma characteristics of the input images and improve color fidelity. Visual gamma compensation is also employed to maintain perceptual luminance consistency and synthesize a natural fine image with balanced tone and smooth contrast transitions. Experiments conducted on 86 exposure pairs demonstrate that the proposed model achieves superior fusion quality compared with conventional and deep-learning-based methods, obtaining high JNBM (17.91) and HyperIQA (70.37) scores. Overall, the proposed variance-driven U-Net effectively mitigates dataset dependency and color distortion, providing a reliable and computationally efficient solution for robust MEF applications. Full article

(This article belongs to the Special Issue Image Processing and Machine Learning with Applications)
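
The variance-based weighting described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the helper names (`random_crop`, `patch_weight`, `weighted_loss`), the use of population variance, and the normalization by the sum of weights are all assumptions made for illustration. The abstract only states that the variance of randomly cropped patches is used as a training weight.

```python
import random
from statistics import pvariance


def random_crop(image, size, rng):
    """Cut a size x size patch from a 2D list of pixel values (hypothetical helper)."""
    rows, cols = len(image), len(image[0])
    top = rng.randrange(rows - size + 1)
    left = rng.randrange(cols - size + 1)
    return [row[left:left + size] for row in image[top:top + size]]


def patch_weight(patch):
    """Population variance of all pixels in the patch, used here as a loss weight."""
    pixels = [p for row in patch for p in row]
    return pvariance(pixels)


def weighted_loss(per_patch_losses, weights, eps=1e-8):
    """Variance-weighted mean of per-patch losses; eps guards against all-zero weights."""
    total = sum(weights) + eps
    return sum(w * l for w, l in zip(weights, per_patch_losses)) / total


if __name__ == "__main__":
    rng = random.Random(0)
    image = [[float((r * c) % 7) for c in range(8)] for r in range(8)]
    patches = [random_crop(image, 4, rng) for _ in range(3)]
    weights = [patch_weight(p) for p in patches]
    print(weights)
```

Under this assumed weighting, a flat patch (variance 0) contributes nothing to the loss, while a textured patch dominates it, which matches the abstract's stated goal of emphasizing structurally informative regions.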

14 pages, 7476 KB

Open Access Article

Development of 3D-Stacked 1Megapixel Dual-Time-Gated SPAD Image Sensor with Simultaneous Dual Image Output Architecture for Efficient Sensor Fusion

by Kazuma Chida, Kazuhiro Morimoto, Naoki Isoda, Hiroshi Sekine, Tomoya Sasago, Yu Maehashi, Satoru Mikajiri, Kenzo Tojima, Mahito Shinohara, Ayman T. Abdelghafar, Hiroyuki Tsuchiya, Kazuma Inoue, Satoshi Omodani, Alice Ehara, Junji Iwata, Tetsuya Itano, Yasushi Matsuno, Katsuhito Sakurai and Takeshi Ichikawa

Sensors 2025, 25(21), 6563; https://doi.org/10.3390/s25216563 - 24 Oct 2025

Cited by 5 | Viewed by 2140

Abstract

Sensor fusion is crucial in numerous imaging and sensing applications. Integrating data from multiple sensors with different field-of-view, resolution, and frame timing poses substantial computational overhead. Time-gated single-photon avalanche diode (SPAD) image sensors have been developed to support multiple sensing modalities and mitigate this issue, but mismatched frame timing remains a challenge. Dual-time-gated SPAD image sensors, which can capture dual images simultaneously, have also been developed. However, the reported sensors suffered from medium-to-large pixel pitch, limited resolution, and inability to independently control the exposure time of the dual images, which restricts their applicability. In this paper, we introduce a 5 µm-pitch, 3D-backside-illuminated (BSI) 1Megapixel dual-time-gated SPAD image sensor enabling a simultaneous output of dual images. The developed SPAD image sensor is verified to operate as an RGB-Depth (RGB-D) sensor without complex image alignment. In addition, a novel high dynamic range (HDR) technique, utilizing pileup effect with two parallel in-pixel memories, is validated for dynamic range extension in 2D imaging, achieving a dynamic range of 119.5 dB. The proposed architecture provides dual image output with the same field-of-view, resolution, and frame timing, and is promising for efficient sensor fusion. Full article

(This article belongs to the Special Issue Special Issue on the 2025 International Image Sensor Workshop (IISW2025))
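
To put the 119.5 dB dynamic range reported in the abstract above into perspective, the following sketch converts between decibels and a linear brightest-to-darkest signal ratio. It assumes the 20·log10 convention commonly used for image sensors; the abstract does not state which convention the authors used, and the function names are illustrative.

```python
import math


def db_to_ratio(db):
    """Linear signal ratio for a dynamic range in dB (assumes 20*log10 convention)."""
    return 10 ** (db / 20)


def ratio_to_db(ratio):
    """Dynamic range in dB for a linear signal ratio (assumes 20*log10 convention)."""
    return 20 * math.log10(ratio)


if __name__ == "__main__":
    # Under the assumed convention, 119.5 dB is a ratio of roughly 9.4e5 : 1.
    print(f"{db_to_ratio(119.5):.3e}")
```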

38 pages, 5218 KB

Open Access Article

Improved YOLO-Based Corrosion Detection and Coating Performance Evaluation Under Marine Exposure in Zhoushan, China

by Qifeng Yu, Yudong Han, Xukun Huang and Xinjia Gao

J. Mar. Sci. Eng. 2025, 13(10), 1842; https://doi.org/10.3390/jmse13101842 - 23 Sep 2025

Cited by 2 | Viewed by 1640

Abstract

In response to the challenges of metal corrosion detection and anti-corrosion coating performance evaluation in the marine environment of Zhoushan, this study proposes an improved object detection model, YOLO v5-EfficientViT-NWD-CCA, to enhance the recognition accuracy and detection efficiency of corrosion images on marine structures. Based on YOLO v5, the model incorporates the EfficientViT backbone network, NWD (Normalized Wasserstein Distance) loss function, and CCA (Criss-Cross Attention) attention mechanism, outperforming comparative models across multiple key metrics. Experimental results show that the proposed model increases precision from 0.73 to 0.76 (approximately 4% improvement) and raises the True Positive rate from 0.66 to 0.70 (approximately 6% improvement) according to the confusion matrix, demonstrating more stable overall detection performance. Building on this, the study combines the model’s detection results to conduct a quantitative analysis of the corrosion area of eight types of anti-corrosion coatings in two typical marine environments—tidal zones and fully immersed zones—across different exposure periods (24, 60, and 96 months). The results indicate that the tidal zone presents a harsher corrosion environment, with corrosion severity significantly increasing over time. Fusion-bonded epoxy coatings, powder epoxy coatings, and fluorocarbon coatings exhibit good corrosion resistance, whereas chlorinated rubber coatings and conventional epoxy coatings perform poorly. This study not only achieves intelligent identification and precise quantification of corrosion areas but also provides a scientific basis for the selection and evaluation of anti-corrosion coatings in different marine environments. Full article

(This article belongs to the Special Issue Advanced Research on the Sustainable Maritime Transportation (2nd Edition))
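
The percentage improvements quoted in the abstract above appear to be relative gains rather than percentage-point differences. A short check, with `relative_gain` as an illustrative helper name and the four metric values taken from the abstract:

```python
def relative_gain(before, after):
    """Relative change from `before` to `after`, as a fraction of `before`."""
    return (after - before) / before


if __name__ == "__main__":
    # Precision 0.73 -> 0.76 and True Positive rate 0.66 -> 0.70 (values from the abstract).
    print(round(relative_gain(0.73, 0.76) * 100, 1))  # about 4.1 percent
    print(round(relative_gain(0.66, 0.70) * 100, 1))  # about 6.1 percent
```

In absolute terms these correspond to gains of 3 and 4 percentage points, respectively.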

35 pages, 47811 KB

Open Access Feature Paper Article

Single-Exposure HDR Image Translation via Synthetic Wide-Band Characteristics Reflected Image Training

by Seung Hwan Lee and Sung Hak Lee

Mathematics 2025, 13(16), 2644; https://doi.org/10.3390/math13162644 - 17 Aug 2025

Viewed by 1590

Abstract

High dynamic range (HDR) tone mapping techniques have been widely studied to effectively represent the broad dynamic range of real-world scenes. However, generating an HDR image from multiple low dynamic range (LDR) images captured at different exposure levels can introduce ghosting artifacts in dynamic scenes. Moreover, methods that estimate HDR information from a single LDR image often suffer from inherent accuracy limitations. To overcome these limitations, this study proposes a novel image processing technique that extends the dynamic range of a single LDR image. This technique achieves the goal through leveraging a Convolutional Neural Network (CNN) to generate a synthetic Near-Infrared (NIR) image—one that emulates the characteristic of real NIR imagery being less susceptible to diffraction, thus preserving sharper outlines and clearer details. This synthetic NIR image is then fused with the original LDR image, which contains color information, to create a tone-distributed HDR-like image. The synthetic NIR image is produced using a lightweight U-Net-based autoencoder, where the encoder extracts features from the LDR image, and the decoder synthesizes a synthetic NIR image that replicates the characteristics of a real NIR image. To enhance feature fusion, a cardinality structure inspired by Extended-Efficient Layer Aggregation Networks (E-ELAN) in You Only Look Once Version 7 (YOLOv7) and a modified convolutional block attention module (CBAM) incorporating a difference map are applied. The loss function integrates a discriminator to enforce adversarial loss, while VGG, structural similarity index, and mean squared error losses contribute to overall image fidelity. Additionally, non-reference image quality assessment losses based on BRISQUE and NIQE are incorporated to further refine image quality. Experimental results demonstrate that the proposed method outperforms conventional HDR techniques in both qualitative and quantitative evaluations. Full article

(This article belongs to the Special Issue Advances in Artificial Intelligence: Models, Optimization, and Machine Learning, 3rd Edition)

17 pages, 74988 KB

Open Access Article

EDMF: A New Benchmark for Multi-Focus Images with the Challenge of Exposure Difference

by Hui Li, Tianyu Shen, Zeyang Zhang, Xuefeng Zhu and Xiaoning Song

Sensors 2024, 24(22), 7287; https://doi.org/10.3390/s24227287 - 14 Nov 2024

Cited by 4 | Viewed by 2333

Abstract

The goal of the multi-focus image fusion (MFIF) task is to merge images with different focus areas into a single clear image. In real world scenarios, in addition to varying focus attributes, there are also exposure differences between multi-source images, which is an important but often overlooked issue. To address this drawback and improve the development of the MFIF task, a new image fusion dataset is introduced called EDMF. Compared with the existing public MFIF datasets, it contains more images with exposure differences, which is more challenging and has a numerical advantage. Specifically, EDMF contains 1000 pairs of color images captured in real-world scenes, with some pairs exhibiting significant exposure difference. These images are captured using smartphones, encompassing diverse scenes and lighting conditions. Additionally, in this paper, a baseline method is also proposed, which is an improved version of memory unit-based unsupervised learning. By incorporating multiple adaptive memory units and spatial frequency information, the network is guided to focus on learning features from in-focus areas. This approach enables the network to effectively learn focus features during training, resulting in clear fused images that align with human visual perception. Experimental results demonstrate the effectiveness of the proposed method in handling exposure difference, achieving excellent fusion results in various complex scenes. Full article

(This article belongs to the Topic Applied Computer Vision and Pattern Recognition: 2nd Edition)

30 pages, 26891 KB

Open Access Feature Paper Article

Multiexposed Image-Fusion Strategy Using Mutual Image Translation Learning with Multiscale Surround Switching Maps

by Young-Ho Go, Seung-Hwan Lee and Sung-Hak Lee

Mathematics 2024, 12(20), 3244; https://doi.org/10.3390/math12203244 - 16 Oct 2024

Cited by 3 | Viewed by 2039

Abstract

The dynamic range of an image represents the difference between its darkest and brightest areas, a crucial concept in digital image processing and computer vision. Despite display technology advancements, replicating the broad dynamic range of the human visual system remains challenging, necessitating high dynamic range (HDR) synthesis, combining multiple low dynamic range images captured at contrasting exposure levels to generate a single HDR image that integrates the optimal exposure regions. Recent deep learning advancements have introduced innovative approaches to HDR generation, with the cycle-consistent generative adversarial network (CycleGAN) gaining attention due to its robustness against domain shifts and ability to preserve content style while enhancing image quality. However, traditional CycleGAN methods often rely on unpaired datasets, limiting their capacity for detail preservation. This study proposes an improved model by incorporating a switching map (SMap) as an additional channel in the CycleGAN generator using paired datasets. The SMap focuses on essential regions, guiding weighted learning to minimize the loss of detail during synthesis. Using translated images to estimate the middle exposure integrates these images into HDR synthesis, reducing unnatural transitions and halo artifacts that could occur at boundaries between various exposures. The multilayered application of the retinex algorithm captures exposure variations, achieving natural and detailed tone mapping. The proposed mutual image translation module extends CycleGAN, demonstrating superior performance in multiexposure fusion and image translation, significantly enhancing HDR image quality. The image quality evaluation indices used are CPBDM, JNBM, LPC-SI, S3, JPEG_2000, and SSEQ, and the proposed model exhibits superior performance compared to existing methods, recording average scores of 0.6196, 15.4142, 0.9642, 0.2838, 80.239, and 25.054, respectively. 
Therefore, based on qualitative and quantitative results, this study demonstrates the superiority of the proposed model. Full article

(This article belongs to the Special Issue Computer Vision, Image Processing Technologies and Artificial Intelligence, 2nd Edition)

26 pages, 23127 KB

Open Access Article

MEFSR-GAN: A Multi-Exposure Feedback and Super-Resolution Multitask Network via Generative Adversarial Networks

by Sibo Yu, Kun Wu, Guang Zhang, Wanhong Yan, Xiaodong Wang and Chen Tao

Remote Sens. 2024, 16(18), 3501; https://doi.org/10.3390/rs16183501 - 21 Sep 2024

Cited by 5 | Viewed by 1875

Abstract

In applications such as satellite remote sensing and aerial photography, imaging equipment must capture brightness information of different ground scenes within a restricted dynamic range. Due to camera sensor limitations, captured images can represent only a portion of such information, which results in lower resolution and lower dynamic range compared with real scenes. Image super resolution (SR) and multiple-exposure image fusion (MEF) are commonly employed technologies to address these issues. Nonetheless, these two problems are often researched in separate directions. In this paper, we propose MEFSR-GAN: an end-to-end framework based on generative adversarial networks that simultaneously combines super-resolution and multiple-exposure fusion. MEFSR-GAN includes a generator and two discriminators. The generator network consists of two parallel sub-networks for under-exposure and over-exposure, each containing a feature extraction block (FEB), a super-resolution block (SRB), and several multiple-exposure feedback blocks (MEFBs). It processes low-resolution under- and over-exposed images to produce high-resolution high dynamic range (HDR) images. These images are evaluated by two discriminator networks, driving the generator to generate realistic high-resolution HDR outputs through multi-goal training. Extensive qualitative and quantitative experiments were conducted on the SICE dataset, yielding a PSNR of 24.821 and an SSIM of 0.896 for 2× upscaling. These results demonstrate that MEFSR-GAN outperforms existing methods in terms of both visual effects and objective evaluation metrics, thereby establishing itself as a state-of-the-art technology. Full article

(This article belongs to the Special Issue Deep Learning and Computer Vision in Remote Sensing-III)
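
The abstract above reports PSNR without giving its formula or peak value. As a reminder of what the metric measures, here is a minimal sketch of the standard definition, PSNR = 10·log10(peak² / MSE), on images stored as 2D lists. The `peak=255.0` default (8-bit images) and the function names are assumptions, not details taken from the paper.

```python
import math


def mse(a, b):
    """Mean squared error between two equally sized 2D lists of pixel values."""
    pairs = [(x, y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b)]
    return sum((x - y) ** 2 for x, y in pairs) / len(pairs)


def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio in dB; returns infinity for identical images."""
    err = mse(a, b)
    if err == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / err)


if __name__ == "__main__":
    reference = [[10, 20], [30, 40]]
    distorted = [[12, 18], [33, 37]]
    print(psnr(reference, distorted))
```

Higher values indicate a smaller pixel-wise error relative to the peak signal; SSIM, also quoted in the abstract, is a separate structural metric and is not covered by this sketch.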

14 pages, 5273 KB

Open Access Article

Mask Mixup Model: Enhanced Contrastive Learning for Few-Shot Learning

by Kai Xie, Yuxuan Gao, Yadang Chen and Xun Che

Appl. Sci. 2024, 14(14), 6063; https://doi.org/10.3390/app14146063 - 11 Jul 2024

Cited by 1 | Viewed by 2371

Abstract

Few-shot image classification aims to improve the performance of traditional image classification when faced with limited data. Its main challenge lies in effectively utilizing sparse sample label data to accurately predict the true feature distribution. Recent approaches have employed data augmentation techniques like random Mask or mixture interpolation to enhance the diversity and generalization of labeled samples. However, these methods still encounter several issues: (1) random Mask can lead to complete blockage or exposure of foreground, causing loss of crucial sample information; and (2) uniform data distribution after mixture interpolation makes it difficult for the model to differentiate between different categories and effectively distinguish their boundaries. To address these challenges, this paper introduces a novel data augmentation method based on saliency mask blending. Firstly, it selectively preserves key image features through adaptive selection and retention using visual feature occlusion fusion and confidence clipping strategies. Secondly, a visual feature saliency fusion approach is employed to calculate the importance of various image regions, guiding the blending process to produce more diverse and enriched images with clearer category boundaries. The proposed method achieves outstanding performance on multiple standard few-shot image classification datasets (miniImageNet, tieredImageNet, Few-shot FC100, and CUB), surpassing state-of-the-art methods by approximately 0.2–1%. Full article

21 pages, 22780 KB

Open Access Article

Ref-MEF: Reference-Guided Flexible Gated Image Reconstruction Network for Multi-Exposure Image Fusion

by Yuhui Huang, Shangbo Zhou, Yufen Xu, Yijia Chen and Kai Cao

Entropy 2024, 26(2), 139; https://doi.org/10.3390/e26020139 - 3 Feb 2024

Cited by 1 | Viewed by 3540

Abstract

Multi-exposure image fusion (MEF) is a computational approach that amalgamates multiple images, each captured at a different exposure level, into a single, high-quality image that faithfully encapsulates the visual information from all the contributing images. Deep learning-based MEF methodologies often confront obstacles due to the inherent inflexibilities of neural network structures, presenting difficulties in dynamically handling an unpredictable number of exposure inputs. In response to this challenge, we introduce Ref-MEF, a method for color image multi-exposure fusion guided by a reference image and designed to deal with an uncertain number of inputs. We establish a reference-guided exposure correction (REC) module based on channel attention and spatial attention, which can correct input features and enhance pre-extraction features. The exposure-guided feature fusion (EGFF) module combines original image information and uses Gaussian filter weights for feature fusion while keeping the feature dimensions constant. The image reconstruction is completed through a gated context aggregation network (GCAN) and global residual learning (GRL). Our refined loss function incorporates gradient fidelity, producing high dynamic range images that are rich in detail and demonstrate superior visual quality. In evaluation metrics focused on image features, our method exhibits significant superiority and leads in holistic assessments as well. It is worth emphasizing that as the number of input images increases, our algorithm exhibits notable computational efficiency. Full article

(This article belongs to the Topic Color Image Processing: Models and Methods (CIP: MM))

18 pages, 4644 KB

Open Access Article

Synthetic 3D Spinal Vertebrae Reconstruction from Biplanar X-rays Utilizing Generative Adversarial Networks

by Babak Saravi, Hamza Eren Guzel, Alisia Zink, Sara Ülkümen, Sebastien Couillard-Despres, Jakob Wollborn, Gernot Lang and Frank Hassel

J. Pers. Med. 2023, 13(12), 1642; https://doi.org/10.3390/jpm13121642 - 24 Nov 2023

Cited by 18 | Viewed by 3991

Abstract

Computed tomography (CT) offers detailed insights into the internal anatomy of patients, particularly for spinal vertebrae examination. However, CT scans are associated with higher radiation exposure and cost compared to conventional X-ray imaging. In this study, we applied a Generative Adversarial Network (GAN) framework to reconstruct 3D spinal vertebrae structures from synthetic biplanar X-ray images, specifically focusing on anterior and lateral views. The synthetic X-ray images were generated using the DRRGenerator module in 3D Slicer by incorporating segmentations of spinal vertebrae in CT scans for the region of interest. This approach leverages a novel feature fusion technique based on X2CT-GAN to combine information from both views and employs a combination of mean squared error (MSE) loss and adversarial loss to train the generator, resulting in high-quality synthetic 3D spinal vertebrae CTs. A total of n = 440 CT data were processed. We evaluated the performance of our model using multiple metrics, including mean absolute error (MAE) (for each slice of the 3D volume (MAE0) and for the entire 3D volume (MAE)), cosine similarity, peak signal-to-noise ratio (PSNR), 3D peak signal-to-noise ratio (PSNR-3D), and structural similarity index (SSIM). The average PSNR was 28.394 dB, PSNR-3D was 27.432, SSIM was 0.468, cosine similarity was 0.484, MAE0 was 0.034, and MAE was 85.359. The results demonstrated the effectiveness of this approach in reconstructing 3D spinal vertebrae structures from biplanar X-rays, although some limitations in accurately capturing the fine bone structures and maintaining the precise morphology of the vertebrae were present. This technique has the potential to enhance the diagnostic capabilities of low-cost X-ray machines while reducing radiation exposure and cost associated with CT scans, paving the way for future applications in spinal imaging and diagnosis. Full article

(This article belongs to the Special Issue Novel Challenges and Advances in Orthopaedic and Trauma Surgery)

17 pages, 4088 KB

Open Access Article

Enhanced LDR Detail Rendering for HDR Fusion by TransU-Fusion Network

by Bo Song, Rui Gao, Yong Wang and Qi Yu

Symmetry 2023, 15(7), 1463; https://doi.org/10.3390/sym15071463 - 23 Jul 2023

Cited by 2 | Viewed by 2592

Abstract

High Dynamic Range (HDR) images are widely used in automotive, aerospace, AI, and other fields but are limited by the maximum dynamic range of a single data acquisition using CMOS image sensors. HDR images are usually synthesized through multiple-exposure and image processing techniques. One of the most challenging tasks in fusing multiframe Low Dynamic Range (LDR) images into an HDR image is eliminating ghosting artifacts caused by motion. In traditional algorithms, optical flow is generally used to align dynamic scenes before image fusion, which achieves good results for small-scale motion but causes obvious ghosting artifacts when the motion magnitude is large. Recently, attention mechanisms have been introduced during the alignment stage to enhance the network’s ability to remove ghosts. However, significant ghosting artifacts still occur in some scenarios with large-scale motion or oversaturated areas. We propose a novel Distilled Feature Transformer Block (DFTB) structure to distill and re-extract information from deep image features obtained after U-Net downsampling, achieving ghost removal at the semantic level for HDR fusion. We introduce a Feature Distillation Transformer Block (FDTB), based on the Swin-Transformer and RFDB structure. FDTB uses multiple distillation connections to learn more discriminative feature representations. For the task of ghost removal in multi-exposure HDR fusion of moving scenes, previous deep learning methods already remove ghosting so effectively that residual ghosts of moving objects are almost unobservable in the composite HDR image. The method in this paper therefore focuses on preserving the details of the LDR images more completely after ghost removal, in order to synthesize high-quality HDR images. 
With the proposed FDTB, the edge and texture details of the synthesized HDR image are better preserved, which shows that FDTB is more effective at retaining detail during image fusion. Furthermore, we propose a new deep learning framework based on DFTB for fusing and removing ghosts from deep image features, called TransU-Fusion. First, we use the encoder in U-Net to extract image features of different exposures and map them to feature spaces of different dimensions. By utilizing the symmetry of the U-Net structure, we can ultimately output these feature images as original-size HDR images. Then, we further fuse high-dimensional space features using a Dilated Residual Dense Block (DRDB) to expand the receptive field, which is beneficial for repairing over-saturated regions. We use the transformer in DFTB to perform low-pass filtering on low-dimensional space features and interact with global information to remove ghosts. Finally, the processed features are merged and output through the decoder as an HDR image without ghosting artifacts. After testing on datasets and comparing with benchmark and state-of-the-art models, the results demonstrate our model’s excellent information fusion ability and stronger ghost removal capability. Full article

(This article belongs to the Special Issue Symmetry in Probablistic Models and Aerospace Systems)

33 pages, 6761 KB

Open Access Article

Hybrid Models Based on Fusion Features of a CNN and Handcrafted Features for Accurate Histopathological Image Analysis for Diagnosing Malignant Lymphomas

by Mohammed Hamdi, Ebrahim Mohammed Senan, Mukti E. Jadhav, Fekry Olayah, Bakri Awaji and Khaled M. Alalayah

Diagnostics 2023, 13(13), 2258; https://doi.org/10.3390/diagnostics13132258 - 4 Jul 2023

Cited by 36 | Viewed by 4180

Abstract

Malignant lymphoma is one of the most severe types of disease that leads to death as a result of exposure of lymphocytes to malignant tumors. The transformation of cells from indolent B-cell lymphoma to B-cell lymphoma (DBCL) is life-threatening. Biopsies taken from the patient are the gold standard for lymphoma analysis. Glass slides under a microscope are converted into whole slide images (WSI) to be analyzed by AI techniques through biomedical image processing. Because of the multiplicity of types of malignant lymphomas, manual diagnosis by pathologists is difficult, tedious, and subject to disagreement among physicians. The importance of artificial intelligence (AI) in the early diagnosis of malignant lymphoma is significant and has revolutionized the field of oncology. The use of AI in the early diagnosis of malignant lymphoma offers numerous benefits, including improved accuracy, faster diagnosis, and risk stratification. This study developed several strategies based on hybrid systems to analyze histopathological images of malignant lymphomas. For all proposed models, the images and extraction of malignant lymphocytes were optimized by the gradient vector flow (GVF) algorithm. The first strategy for diagnosing malignant lymphoma images relied on a hybrid system between three types of deep learning (DL) networks, XGBoost algorithms, and decision tree (DT) algorithms based on the GVF algorithm. The second strategy for diagnosing malignant lymphoma images was based on fusing the features of the MobileNet-VGG16, VGG16-AlexNet, and MobileNet-AlexNet models and classifying them by XGBoost and DT algorithms based on the ant colony optimization (ACO) algorithm. The color, shape, and texture features, which are called handcrafted features, were extracted by four traditional feature extraction algorithms. 
Because of the similarity in the biological characteristics of early-stage malignant lymphomas, the features of the fused MobileNet-VGG16, VGG16-AlexNet, and MobileNet-AlexNet models were combined with the handcrafted features and classified by the XGBoost and DT algorithms based on the ACO algorithm. We concluded that the XGBoost and DT classifiers achieved their best performance with features fused from the DL networks and the handcrafted features. The XGBoost classifier based on the fused features of MobileNet-VGG16 and handcrafted features resulted in an AUC of 99.43%, accuracy of 99.8%, precision of 99.77%, sensitivity of 99.7%, and specificity of 99.8%. This highlights the significant role of AI in the early diagnosis of malignant lymphoma, offering improved accuracy, expedited diagnosis, and enhanced risk stratification. By leveraging AI techniques and biomedical image processing, the analysis of whole slide images (WSI) converted from biopsies allows for improved accuracy, faster diagnosis, and risk stratification. The developed strategies based on hybrid systems, combining deep learning networks, XGBoost, and decision tree algorithms, demonstrated promising results in diagnosing malignant lymphoma images. Furthermore, the fusion of handcrafted features with features extracted from DL networks enhanced the performance of the classification models.
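For readers less familiar with the reported metrics, the sketch below shows how accuracy, precision, sensitivity, and specificity follow from confusion-matrix counts for a single class treated one-vs-rest. The function name and the counts are hypothetical and are not taken from the paper; they are only there to make the definitions concrete.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard classification metrics from one-vs-rest confusion counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": tp / (tp + fp),        # of predicted positives, how many are right
        "sensitivity": tp / (tp + fn),      # recall: of true positives, how many are found
        "specificity": tn / (tn + fp),      # of true negatives, how many are rejected
    }


# Illustrative counts only (not results from the study).
metrics = binary_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In a multi-class setting such as lymphoma subtyping, these values are typically computed per class and then averaged; the abstract does not say which averaging was used.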

(This article belongs to the Special Issue Artificial Intelligence in Computational Pathology)

21 pages, 9047 KB

Open AccessArticle

Study on a Low-Illumination Enhancement Method for Online Monitoring Images Considering Multiple-Exposure Image Sequence Fusion

by Wenlong Zhao, Chengwei Jiang, Yunzhu An, Xiaopeng Yan and Chaofeng Dai

Electronics 2023, 12(12), 2654; https://doi.org/10.3390/electronics12122654 - 13 Jun 2023

Cited by 2 | Viewed by 2044

Abstract

To address low image quality caused by insufficient illumination, a robust low-light image enhancement method is proposed that effectively handles extremely dark images while also achieving good results for scenes with insufficient local illumination. First, we expose the images to different degrees to form a multi-exposure image sequence; then, we introduce global-based luminance weights and contrast-based gradient weights to fuse the multi-exposure image sequence; finally, we use a bootstrap filter to suppress the noise that may occur during image processing. We assess the enhancement quality with the Peak Signal to Noise Ratio (PSNR), Structural Similarity (SSIM), the Average Gradient (AG), and the Figure Definition (FD). Experimental results show that the PSNR (31.32) and SSIM (0.74) are the highest in very dark scenes compared with most conventional algorithms such as MF, BIMEF, and LECARM. Similarly, when processing unevenly illuminated images such as “moonlit night” images, the AG (10.21) and the FD (14.54) are the highest. Other evaluation metrics such as Shannon (SH) are also optimal in these scenarios. In addition, we apply the algorithm to online monitoring images of electric power equipment, where it improves image lightness while recovering detail information. The algorithm is strongly robust on extremely dark images and natural low-light images, and the enhanced images have minimal distortion and the best appearance in different low-light scenes.
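The weighted fusion step described in this abstract can be sketched roughly as follows. This is not the authors' algorithm: the abstract does not define its luminance or gradient weights, so the sketch substitutes a Gaussian well-exposedness weight and a central-difference gradient magnitude, in the spirit of classic exposure fusion. All function names, the `target` and `sigma` parameters, and the sample pixel values are assumptions, and the noise-filtering stage is omitted.

```python
import math


def luminance_weight(img, target=0.5, sigma=0.2):
    """Assumed weight: favors pixels whose intensity is near mid-gray."""
    return [[math.exp(-((p - target) ** 2) / (2 * sigma ** 2)) for p in row]
            for row in img]


def gradient_weight(img):
    """Assumed weight: central-difference gradient magnitude (local contrast)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
            gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
            out[y][x] = math.hypot(gx, gy)
    return out


def fuse(exposures, eps=1e-6):
    """Per-pixel weighted average of grayscale exposures with values in [0, 1]."""
    h, w = len(exposures[0]), len(exposures[0][0])
    weights = []
    for img in exposures:
        lw, gw = luminance_weight(img), gradient_weight(img)
        # eps keeps flat regions from receiving a zero weight in every exposure.
        weights.append([[lw[y][x] * (gw[y][x] + eps) for x in range(w)]
                        for y in range(h)])
    fused = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = sum(wt[y][x] for wt in weights)
            fused[y][x] = sum(wt[y][x] * img[y][x]
                              for wt, img in zip(weights, exposures)) / total
    return fused


# Hypothetical 2x2 under- and over-exposed frames.
dark = [[0.05, 0.10], [0.10, 0.40]]
bright = [[0.60, 0.70], [0.80, 1.00]]
fused = fuse([dark, bright])
print(fused)
```

Because the weights are normalized per pixel, every fused value lies between the darkest and brightest input at that location; practical pipelines usually blend across scales as well to avoid seams, which this sketch does not attempt.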

(This article belongs to the Special Issue Recent Advances in Computer Vision: Technologies and Applications)

21 pages, 13135 KB

Open AccessArticle

Multi-Task Learning Approach Using Dynamic Hyperparameter for Multi-Exposure Fusion

by Chan-Gi Im, Dong-Min Son, Hyuk-Ju Kwon and Sung-Hak Lee

Mathematics 2023, 11(7), 1620; https://doi.org/10.3390/math11071620 - 27 Mar 2023

Cited by 1 | Viewed by 2358

Abstract

High-dynamic-range (HDR) image synthesis is a technology developed to accurately reproduce the actual scene of an image on a display by extending the dynamic range of an image. Multi-exposure fusion (MEF) technology, which synthesizes multiple low-dynamic-range (LDR) images to create an HDR image, has been developed in various ways, including pixel-based, patch-based, and deep-learning-based methods. Recently, research in the field of MEF has mainly focused on improving synthesis quality with deep-learning-based algorithms. Despite the various advantages of deep learning, deep-learning-based methods require numerous multi-exposed and ground-truth images for training. In this study, we propose a self-supervised learning method that generates and learns reference images based on input images during the training process. In addition, we propose a method to train a deep learning model for MEF with multiple tasks using dynamic hyperparameters on the loss functions. This enables effective network optimization across multiple tasks and high-quality image synthesis while preserving a simple network architecture. Our learning method applied to the deep learning model shows superior synthesis results compared to other existing deep-learning-based image synthesis algorithms.

(This article belongs to the Special Issue Application of Machine Learning in Image Processing and Computer Vision)

Search Results (32)
