Search Results (76)

Search Parameters:
Keywords = peak pixel extraction

19 pages, 3294 KiB  
Article
Rotation- and Scale-Invariant Object Detection Using Compressed 2D Voting with Sparse Point-Pair Screening
by Chenbo Shi, Yue Yu, Gongwei Zhang, Shaojia Yan, Changsheng Zhu, Yanhong Cheng and Chun Zhang
Electronics 2025, 14(15), 3046; https://doi.org/10.3390/electronics14153046 - 30 Jul 2025
Abstract
The Generalized Hough Transform (GHT) is a powerful method for rigid shape detection under rotation, scaling, translation, and partial occlusion conditions, but its four-dimensional accumulator incurs prohibitive computational and memory demands that prevent real-time deployment. To address this, we propose a framework that compresses the 4-D search space into a concise 2-D voting scheme by combining two-level sparse point-pair screening with an accelerated lookup. In the offline stage, template edges are extracted using an adaptive Canny operator with Otsu-determined thresholds, and gradient-direction differences for all point pairs are quantized to retain only those in the dominant bin, yielding rotation- and scale-invariant descriptors that populate a compact 2-D reference table. During the online stage, an adaptive grid selects only the highest-gradient pixels per cell as base points, while a precomputed gradient-direction bucket table enables constant-time retrieval of compatible subpoints. Each valid base–subpoint pair is mapped to indices in the lookup table, and “fuzzy” votes are cast over a 3 × 3 neighborhood in the 2-D accumulator, whose global peak determines the object center. Evaluation on 200 real industrial parts—augmented to 1000 samples with noise, blur, occlusion, and nonlinear illumination—demonstrates that our method maintains over 90% localization accuracy, matches the classical GHT in accuracy, and achieves a ten-fold speedup, outperforming IGHT and LI-GHT variants by 2–3×, thereby delivering a robust, real-time solution for industrial rigid object localization. Full article
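The Otsu-thresholded edge-extraction step in the offline stage can be sketched minimally in NumPy. This is an illustrative sketch, not the authors' implementation: `otsu_threshold` is a standard between-class-variance maximizer, and the 0.5× low/high Canny-threshold ratio is an assumed heuristic.

```python
import numpy as np

def otsu_threshold(gray: np.ndarray) -> float:
    """Otsu's method: pick the gray level maximizing between-class variance."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    p = hist / hist.sum()
    omega = np.cumsum(p)                    # class-0 probability up to level t
    mu = np.cumsum(p * np.arange(256))      # class-0 cumulative mean
    mu_t = mu[-1]                           # global mean
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b2 = (mu_t * omega - mu) ** 2 / (omega * (1.0 - omega))
    sigma_b2 = np.nan_to_num(sigma_b2)      # undefined at the extremes -> 0
    return float(np.argmax(sigma_b2))

# High/low Canny thresholds derived from Otsu (the 0.5 ratio is an assumption):
img = np.random.default_rng(0).integers(0, 256, (64, 64)).astype(np.uint8)
high = otsu_threshold(img)
low = 0.5 * high
```

On a strongly bimodal image the returned threshold falls between the two modes, which is what makes it a reasonable automatic choice for the Canny hysteresis pair.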

14 pages, 16727 KiB  
Article
Well Begun Is Half Done: The Impact of Pre-Processing in MALDI Mass Spectrometry Imaging Analysis Applied to a Case Study of Thyroid Nodules
by Giulia Capitoli, Kirsten C. J. van Abeelen, Isabella Piga, Vincenzo L’Imperio, Marco S. Nobile, Daniela Besozzi and Stefania Galimberti
Stats 2025, 8(3), 57; https://doi.org/10.3390/stats8030057 - 10 Jul 2025
Cited by 1 | Viewed by 226
Abstract
The discovery of proteomic biomarkers in cancer research can be effectively performed in situ by exploiting Matrix-Assisted Laser Desorption Ionization (MALDI) Mass Spectrometry Imaging (MSI). However, due to experimental limitations, the spectra extracted by MALDI-MSI can be noisy, so pre-processing steps are generally needed to reduce the instrumental and analytical variability. Thus far, the importance and the effect of standard pre-processing methods, as well as their combinations and parameter settings, have not been extensively investigated in proteomics applications. In this work, we present a systematic study of 15 combinations of pre-processing steps—including baseline correction, smoothing, normalization, and peak alignment—for a real-data classification task on MALDI-MSI data measured from fine-needle aspiration biopsies of thyroid nodules. The influence of each combination was assessed by analyzing the feature extraction, pixel-by-pixel classification probabilities, and LASSO classification performance. Our results highlight the necessity of fine-tuning a pre-processing pipeline, especially for the reliable transfer of molecular diagnostic signatures in clinical practice. We outline some recommendations on the selection of pre-processing steps, together with filter levels and alignment methods, according to the mass-to-charge range and heterogeneity of data. Full article
(This article belongs to the Section Applied Statistics and Machine Learning Methods)
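A minimal NumPy sketch of the kind of pre-processing chain the study compares (smoothing, baseline correction, normalization). The moving-average and rolling-minimum choices and the window sizes are assumptions for illustration, not the specific operators evaluated in the paper:

```python
import numpy as np

def smooth(spectrum, window=5):
    """Moving-average smoothing (a simple stand-in for e.g. Savitzky-Golay)."""
    kernel = np.ones(window) / window
    return np.convolve(spectrum, kernel, mode="same")

def baseline_subtract(spectrum, window=51):
    """Crude baseline estimate: subtract a rolling minimum."""
    pad = window // 2
    padded = np.pad(spectrum, pad, mode="edge")
    base = np.array([padded[i:i + window].min() for i in range(len(spectrum))])
    return spectrum - base

def tic_normalize(spectrum):
    """Total-ion-current normalization: scale each spectrum to unit sum."""
    return spectrum / spectrum.sum()

raw = np.abs(np.random.default_rng(1).normal(1.0, 0.1, 500)) + 0.2
proc = tic_normalize(baseline_subtract(smooth(raw)))
```

Changing the order or parameters of these steps changes the peaks that survive into feature extraction, which is exactly the sensitivity the study quantifies.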

40 pages, 4919 KiB  
Article
NGSTGAN: N-Gram Swin Transformer and Multi-Attention U-Net Discriminator for Efficient Multi-Spectral Remote Sensing Image Super-Resolution
by Chao Zhan, Chunyang Wang, Bibo Lu, Wei Yang, Xian Zhang and Gaige Wang
Remote Sens. 2025, 17(12), 2079; https://doi.org/10.3390/rs17122079 - 17 Jun 2025
Viewed by 532
Abstract
The reconstruction of high-resolution (HR) remote sensing images (RSIs) from low-resolution (LR) counterparts is a critical task in remote sensing image super-resolution (RSISR). Recent advancements in convolutional neural networks (CNNs) and Transformers have significantly improved RSISR performance due to their capabilities in local feature extraction and global modeling. However, several limitations remain, including the underutilization of multi-scale features in RSIs, the limited receptive field of Swin Transformer’s window self-attention (WSA), and the computational complexity of existing methods. To address these issues, this paper introduces the NGSTGAN model, which employs an N-Gram Swin Transformer as the generator and a multi-attention U-Net as the discriminator. The discriminator enhances attention to multi-scale key features through the addition of channel, spatial, and pixel attention (CSPA) modules, while the generator utilizes an improved shallow feature extraction (ISFE) module to extract multi-scale and multi-directional features, enhancing the capture of complex textures and details. The N-Gram concept is introduced to expand the receptive field of Swin Transformer, and sliding window self-attention (S-WSA) is employed to facilitate interaction between neighboring windows. Additionally, channel-reducing group convolution (CRGC) is used to reduce the number of parameters and computational complexity. A cross-sensor multispectral dataset combining Landsat-8 (L8) and Sentinel-2 (S2) is constructed for the resolution enhancement of L8’s blue (B), green (G), red (R), and near-infrared (NIR) bands from 30 m to 10 m. Experiments show that NGSTGAN outperforms the state-of-the-art (SOTA) method, achieving improvements of 0.5180 dB in the peak signal-to-noise ratio (PSNR) and 0.0153 in the structural similarity index measure (SSIM) over the second best method, offering a more effective solution to the task. Full article
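The PSNR gains quoted above follow the standard definition, which can be computed as below; this is the textbook metric, not code from the paper:

```python
import numpy as np

def psnr(reference, test, data_range=255.0):
    """Peak signal-to-noise ratio in dB between two images."""
    mse = np.mean((reference.astype(float) - test.astype(float)) ** 2)
    if mse == 0:
        return float("inf")           # identical images
    return 10.0 * np.log10(data_range ** 2 / mse)

a = np.full((8, 8), 100.0)
b = a + 10.0                          # uniform error of 10 -> MSE = 100
# psnr(a, b) = 10*log10(255**2 / 100) ≈ 28.13 dB
```

A 0.5 dB PSNR improvement, as reported, corresponds to roughly a 11% reduction in mean squared error at fixed data range.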

10 pages, 3266 KiB  
Article
Extended Shortwave Infrared T2SL Detector Based on AlAsSb/GaSb Barrier Optimization
by Jing Yu, Yuegang Fu, Lidan Lu, Weiqiang Chen, Jianzhen Ou and Lianqing Zhu
Micromachines 2025, 16(5), 575; https://doi.org/10.3390/mi16050575 - 14 May 2025
Viewed by 497
Abstract
Extended shortwave infrared (eSWIR) detectors operating at high temperatures are widely utilized in planetary science. A high-performance eSWIR detector based on a pBin InAs/GaSb/AlSb type-II superlattice (T2SL) grown on a GaSb substrate is demonstrated. The device's optoelectronic performance is optimized by adjusting the p-type doping concentration in the AlAs0.1Sb0.9/GaSb barrier. Experimental and TCAD simulation results demonstrate that both the device's dark current and responsivity grow as the doping concentration rises. Here, the bulk dark current density and bulk differential resistance area are extracted to calculate the bulk detectivity for evaluating the photoelectric performance of the device. When the barrier concentration is 5 × 10^16 cm^-3, the bulk detectivity is 2.1 × 10^11 cm·Hz^1/2/W, which is 256% higher than at a concentration of 1.5 × 10^18 cm^-3. Moreover, at 300 K (−10 mV), the 100% cutoff wavelength of the device is 1.9 μm, the dark current density is 9.48 × 10^-6 A/cm^2, and the peak specific detectivity is 7.59 × 10^10 cm·Hz^1/2/W (at 1.6 μm). An eSWIR focal plane array (FPA) detector with a 320 × 256 array scale was fabricated for this purpose. It demonstrates a remarkably low blind pixel rate of 0.02% and exhibits excellent imaging quality at room temperature, indicating its vast potential for applications in infrared imaging. Full article
(This article belongs to the Special Issue Integrated Photonics and Optoelectronics, 2nd Edition)
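For context, specific detectivity is commonly estimated from the responsivity together with shot-noise and Johnson-noise terms. The expression below is the standard textbook form; whether the authors' "bulk detectivity" uses exactly these terms is an assumption:

```python
import math

Q = 1.602176634e-19   # elementary charge [C]
KB = 1.380649e-23     # Boltzmann constant [J/K]

def specific_detectivity(responsivity, j_dark, r0a, temperature=300.0):
    """D* [cm·Hz^1/2/W] from shot + Johnson noise.

    responsivity: spectral responsivity [A/W]
    j_dark:       dark current density [A/cm^2]
    r0a:          differential resistance-area product [Ohm·cm^2]
    """
    noise_psd = 2.0 * Q * j_dark + 4.0 * KB * temperature / r0a
    return responsivity / math.sqrt(noise_psd)
```

The formula makes the trade-off in the abstract explicit: raising the doping raises both responsivity (numerator) and dark current (denominator), so D* peaks at an intermediate concentration.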

33 pages, 20540 KiB  
Article
SG-ResNet: Spatially Adaptive Gabor Residual Networks with Density-Peak Guidance for Joint Image Steganalysis and Payload Location
by Zhengliang Lai, Chenyi Wu, Xishun Zhu, Jianhua Wu and Guiqin Duan
Mathematics 2025, 13(9), 1460; https://doi.org/10.3390/math13091460 - 29 Apr 2025
Viewed by 437
Abstract
Image steganalysis detects hidden information in digital images by identifying statistical anomalies, serving as a forensic tool to reveal potential covert communication. Effective steganalysis methods for deep learning-based image steganography remain relatively scarce, particularly methods designed to extract the hidden information. This paper introduces an innovative image steganalysis method based on generative adaptive Gabor residual networks with density-peak guidance (SG-ResNet). SG-ResNet employs a dual-stream collaborative architecture to achieve precise detection and reconstruction of steganographic information. The classification subnet utilizes dual-frequency adaptive Gabor convolutional kernels to decouple high-frequency texture and low-frequency contour components in images. It combines density peak clustering with three quantization- and transformation-enhanced convolutional blocks to generate steganographic covariance matrices, enhancing the weak steganographic signals. The reconstruction subnet synchronously constructs multi-scale features, preserves steganographic spatial fingerprints with a channel-separated residual spatial rich model and pixel reorganization operators, and achieves sub-pixel-level steganographic localization via an iterative optimization mechanism of feedback residual modules. Experimental results obtained with datasets generated by several public steganography algorithms demonstrate that SG-ResNet achieves state-of-the-art performance, with a detection accuracy of 0.94 and a PSNR of 29 dB between reconstructed and original secret images. Full article
(This article belongs to the Special Issue New Solutions for Multimedia and Artificial Intelligence Security)

19 pages, 2806 KiB  
Article
SP-IGAN: An Improved GAN Framework for Effective Utilization of Semantic Priors in Real-World Image Super-Resolution
by Meng Wang, Zhengnan Li, Haipeng Liu, Zhaoyu Chen and Kewei Cai
Entropy 2025, 27(4), 414; https://doi.org/10.3390/e27040414 - 11 Apr 2025
Cited by 1 | Viewed by 548
Abstract
Single-image super-resolution (SISR) based on GANs has achieved significant progress. However, these methods still face challenges when reconstructing locally consistent textures due to a lack of semantic understanding of image categories. This highlights the necessity of focusing on contextual information comprehension and the acquisition of high-frequency details in model design. To address this issue, we propose the Semantic Prior-Improved GAN (SP-IGAN) framework, which incorporates additional contextual semantic information into the Real-ESRGAN model. The framework consists of two branches. The main branch introduces a Graph Convolutional Channel Attention (GCCA) module to transform channel dependencies into adjacency relationships between feature vertices, thereby enhancing pixel associations. The auxiliary branch strengthens the correlation between semantic category information and regional textures in the Residual-in-Residual Dense Block (RRDB) module. The auxiliary branch employs a pretrained segmentation model to accurately extract regional semantic information from the input low-resolution image. This information is injected into the RRDB module through Spatial Feature Transform (SFT) layers, generating more accurate and semantically consistent texture details. Additionally, a wavelet loss is incorporated into the loss function to capture high-frequency details that are often overlooked. The experimental results demonstrate that the proposed SP-IGAN outperforms state-of-the-art (SOTA) super-resolution models across multiple public datasets. For the X4 super-resolution task, SP-IGAN achieves a 0.55 dB improvement in Peak Signal-to-Noise Ratio (PSNR) and a 0.0363 increase in Structural Similarity Index (SSIM) compared to the baseline model Real-ESRGAN. Full article

29 pages, 16314 KiB  
Article
A Novel Framework for Real ICMOS Image Denoising: LD-NGN Noise Modeling and a MAST-Net Denoising Network
by Yifu Luo, Ting Zhang, Ruizhi Li, Bin Zhang, Nan Jia and Liping Fu
Remote Sens. 2025, 17(7), 1219; https://doi.org/10.3390/rs17071219 - 29 Mar 2025
Viewed by 522
Abstract
Intensified complementary metal-oxide semiconductor (ICMOS) sensors involve multiple steps, including photoelectric conversion and photoelectric multiplication, each of which introduces noise that significantly impacts image quality. To address the issues of insufficient denoising performance and poor model generalization in ICMOS image denoising, this paper proposes a systematic solution. First, we established an experimental platform to collect real ICMOS images and introduced a novel noise generation network (LD-NGN) that accurately simulates the strong sparsity and spatial clustering of ICMOS noise, generating a multi-scene paired dataset. Additionally, we proposed a new noise evaluation metric, KL-Noise, which allows a more precise quantification of noise distribution. Based on this, we designed a denoising network specifically for ICMOS images, MAST-Net, and trained it using the multi-scene paired dataset generated by LD-NGN. By capturing multi-scale features of image pixels, MAST-Net effectively removes complex noise. The experimental results show that, compared to traditional methods and denoisers trained with other noise generators, our method outperforms both qualitatively and quantitatively. The denoised images achieve a peak signal-to-noise ratio (PSNR) of 35.38 dB and a structural similarity index (SSIM) of 0.93. This optimization provides support for tasks such as image preprocessing, target recognition, and feature extraction. Full article

14 pages, 1607 KiB  
Article
Global NDVI-LST Correlation: Temporal and Spatial Patterns from 2000 to 2024
by Ehsan Rahimi, Pinliang Dong and Chuleui Jung
Environments 2025, 12(2), 67; https://doi.org/10.3390/environments12020067 - 17 Feb 2025
Cited by 5 | Viewed by 3165
Abstract
While numerous studies have investigated the NDVI-LST relationship at local or regional scales, existing global analyses are outdated and fail to incorporate recent environmental changes driven by climate change and human activity. This study aims to address this gap by conducting an extensive global analysis of NDVI-LST correlations from 2000 to 2024, utilizing multi-source satellite data to assess latitudinal and ecosystem-specific variability. The MODIS dataset, which provides global daily LST data at a 1 km resolution from 2000 to 2024, was used alongside MODIS-derived NDVI data, which offers global vegetation indices at a 1 km resolution and 16-day temporal intervals. A correlation analysis was performed by extracting NDVI and LST values for each raster cell. The analysis revealed significant negative correlations in regions such as the western United States, Brazil, southern Africa, and northern Australia, where increased temperatures suppress vegetation activity. A total of 38,281,647 pixels, or 20% of the global map, exhibited statistically significant correlations, with 80.4% showing negative correlations, indicating a reduction in vegetation activity as temperatures rise. The latitudinal distribution of significant correlations revealed two prominent peaks: one in the tropical and subtropical regions of the Southern Hemisphere and another in the temperate zones of the Northern Hemisphere. This study uncovers notable spatial and latitudinal patterns in the LST-NDVI relationship, with most regions exhibiting negative correlations, underscoring the cooling effects of vegetation. These findings emphasize the crucial role of vegetation in regulating surface temperatures, providing valuable insights into ecosystem health, and informing conservation strategies in response to climate change. Full article
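The per-cell correlation analysis described above amounts to a pixelwise Pearson coefficient computed along the time axis. A vectorized NumPy sketch on synthetic data (illustrative only; the study uses MODIS rasters):

```python
import numpy as np

def pixelwise_pearson(ndvi, lst):
    """Pearson r per pixel between two (time, rows, cols) stacks."""
    nd = ndvi - ndvi.mean(axis=0)
    ls = lst - lst.mean(axis=0)
    num = (nd * ls).sum(axis=0)
    den = np.sqrt((nd ** 2).sum(axis=0) * (ls ** 2).sum(axis=0))
    return num / den

# Synthetic stack where vegetation activity drops as temperature rises:
rng = np.random.default_rng(42)
lst = rng.normal(300.0, 5.0, (24, 4, 4))                  # 24 time steps
ndvi = 0.9 - 0.002 * lst + rng.normal(0, 0.001, lst.shape)
r = pixelwise_pearson(ndvi, lst)                          # strongly negative
```

Significance masking (the "20% of pixels" figure in the abstract) would then threshold each r against the critical value for the available sample size.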

19 pages, 19857 KiB  
Article
A Plug Seedling Growth-Point Detection Method Based on Differential Evolution Extra-Green Algorithm
by Hongmei Xia, Shicheng Zhu, Teng Yang, Runxin Huang, Jianhua Ou, Lingjin Dong, Dewen Tao and Wenbin Zhen
Agronomy 2025, 15(2), 375; https://doi.org/10.3390/agronomy15020375 - 31 Jan 2025
Viewed by 680
Abstract
To produce plug seedlings with uniform growth that are suitable for high-speed transplanting operations, it is essential to sow seeds precisely at the center of each plug-tray hole. For accurately determining the position of the seed covered by the substrate within individual plug-tray holes, a novel method for detecting the growth points of plug seedlings has been proposed. It employs an adaptive grayscale processing algorithm based on the differential evolution extra-green algorithm to extract the contour features of seedlings during the early stages of cotyledon emergence. The pixel overlay curve peak points within the binary image of the plug-tray's background are utilized to delineate the boundaries of the plug-tray holes. Each plug-tray hole containing a single seedling is identified by analyzing the area and perimeter of the seedling's contour connectivity domains. The midpoint of the shortest line between these domains is designated as the growth point of the individual seedling. For laboratory-grown plug seedlings of tomato, pepper, and Chinese kale, the highest detection accuracy was achieved on the third, fourth, and second day post-cotyledon emergence, respectively. The identification rate of missing seedlings and single seedlings exceeded 97.57% and 99.25%, respectively, with a growth-point detection error of less than 0.98 mm. For tomato and broccoli plug seedlings cultivated in a nursery greenhouse three days after cotyledon emergence, the detection accuracy for missing seedlings and single seedlings was greater than 95.78%, with a growth-point detection error of less than 2.06 mm. These results validated the high detection accuracy and broad applicability of the proposed method for various seedling types at the appropriate growth stages. Full article
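The extra-green (ExG) transform underlying the grayscale step can be sketched as follows. The fixed threshold here is an assumed stand-in for the paper's differential-evolution-tuned adaptive step:

```python
import numpy as np

def excess_green(rgb: np.ndarray) -> np.ndarray:
    """ExG = 2g - r - b on chromaticity-normalized channels.

    Green vegetation scores high; substrate/background scores near zero."""
    rgb = rgb.astype(float)
    s = rgb.sum(axis=-1, keepdims=True)
    s[s == 0] = 1.0                         # guard against black pixels
    r, g, b = np.moveaxis(rgb / s, -1, 0)
    return 2.0 * g - r - b

def binarize(exg, threshold=0.1):
    """Fixed threshold; the paper tunes this step adaptively instead."""
    return (exg > threshold).astype(np.uint8)

pixel_green = np.array([[[30, 180, 40]]], dtype=np.uint8)   # seedling-like
pixel_soil = np.array([[[120, 100, 80]]], dtype=np.uint8)   # substrate-like
```

With normalized channels ExG reduces to 3g − 1, so any pixel whose green share exceeds one third scores positive, which is why it separates cotyledons from substrate.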

21 pages, 24174 KiB  
Article
Precision Denoising in Medical Imaging via Generative Adversarial Network-Aided Low-Noise Discriminator Technique
by Turki M. Alanazi and Paolo Mercorelli
Mathematics 2024, 12(23), 3705; https://doi.org/10.3390/math12233705 - 26 Nov 2024
Cited by 1 | Viewed by 1413
Abstract
Medical imaging is significant for accurate diagnosis, but noise often degrades image quality, making it challenging to identify important information. Denoising is a component of traditional image pre-processing that helps prevent incorrect disease diagnosis. Mitigating the noise becomes difficult if there are differences in the low-level segment features. Therefore, a Generative Adversarial Network (GAN)-aided Low-Noise Discriminator (LND) is introduced to improve denoising effectiveness in medical images, balancing image resolution against noise mitigation. The LND is the key component that distinguishes between high- and low-noise areas based on segmented features, which is also achieved by tuning the peak signal-to-noise ratio (PSNR). Considering the training sequences, the LND-identified intervals shorten the sequences to improve the changes in pixel reconstruction. The generator function in this method is responsible for cumulatively increasing the PSNR improvements over the different pixels. The proposed method improves pixel reconstruction by 11.05% and PSNR by 9.75%, with 9.75% less reconstruction time and 13.11% less extraction error at higher pixel distribution ratios than other contemporary methods. Full article
(This article belongs to the Special Issue Deep Neural Networks: Theory, Algorithms and Applications)

13 pages, 721 KiB  
Article
Comparison of On-Sky Wavelength Calibration Methods for Integral Field Spectrograph
by Jie Song, Baichuan Ren, Yuyu Tang, Jun Wei and Xiaoxian Huang
Electronics 2024, 13(20), 4131; https://doi.org/10.3390/electronics13204131 - 21 Oct 2024
Cited by 1 | Viewed by 953
Abstract
With advancements in technology, scientists are delving deeper in their explorations of the universe. Integral field spectrographs (IFSs) play a significant role in investigating the physical properties of supermassive black holes at the centers of galaxies, the nuclei of galaxies, and the star formation processes within galaxies, including under extreme conditions such as those present in galaxy mergers, ultra-low-metallicity galaxies, and star-forming galaxies with strong feedback. An IFS transforms the spatial field into a linear field using an image slicer and obtains the spectra of targets in each spatial resolution element through a grating. Through scientific processing, two-dimensional images for each target band can be obtained. The IFS considered here uses concave gratings as the dispersion system to decompose the polychromatic light emitted by celestial bodies into monochromatic light, arranged linearly according to wavelength. In this experiment, the working environment of a star was simulated in the laboratory to facilitate the wavelength calibration of the space integral field spectrometer, and the tools necessary for the calibration process were explored. A mercury–argon lamp was employed as the light source to extract characteristic information from each pixel in the detector, facilitating the wavelength calibration of the spatial IFS. The optimal peak-finding method was selected by comparing the center-of-gravity, polynomial fitting, and Gaussian fitting methods. Ultimately, employing the 4FFT-LMG algorithm to fit Gaussian curves enabled the determination of the spectral peak positions, yielding wavelength calibration coefficients for a spatial IFS within the range of 360 nm to 600 nm. The correlation of the fitting results between the detector pixel positions and corresponding wavelengths was >99.99%, and the calibration accuracy achieved was 0.0067 nm. Full article
(This article belongs to the Section Circuit and Signal Processing)
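Sub-pixel Gaussian peak finding and the subsequent pixel-to-wavelength fit can be sketched as below. The three-point log-parabola shortcut and the listed pixel positions are illustrative, not the paper's 4FFT-LMG algorithm; the Hg wavelengths are standard emission lines:

```python
import numpy as np

def gaussian_peak_position(y):
    """Sub-pixel peak via a Gaussian fit through the three samples around
    the maximum (a Gaussian is a parabola in log space, so three points
    determine it exactly)."""
    i = int(np.argmax(y))
    if i == 0 or i == len(y) - 1:
        return float(i)                     # peak at the edge: no refinement
    a, b, c = np.log(y[i - 1]), np.log(y[i]), np.log(y[i + 1])
    return i + 0.5 * (a - c) / (a - 2.0 * b + c)

# Sampled Gaussian line centered at pixel 10.3:
x = np.arange(21)
line = np.exp(-0.5 * ((x - 10.3) / 1.5) ** 2)
center = gaussian_peak_position(line)       # recovers ≈ 10.3

# Pixel -> wavelength calibration is then a polynomial fit over known lines
# (pixel positions here are invented; wavelengths are standard Hg lines):
peaks_px = np.array([10.3, 55.8, 120.4])
lams_nm = np.array([404.66, 435.83, 546.07])
coeff = np.polyfit(peaks_px, lams_nm, 2)
```

For noise-free Gaussian samples the log-parabola estimate is exact, which is why Gaussian fitting tends to beat the center-of-gravity method on narrow, well-sampled lines.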

21 pages, 7988 KiB  
Article
SAR Image Despeckling Based on Denoising Diffusion Probabilistic Model and Swin Transformer
by Yucheng Pan, Liheng Zhong, Jingdong Chen, Heping Li, Xianlong Zhang and Bin Pan
Remote Sens. 2024, 16(17), 3222; https://doi.org/10.3390/rs16173222 - 30 Aug 2024
Cited by 3 | Viewed by 2991
Abstract
The speckle noise inherent in synthetic aperture radar (SAR) imaging has long posed a challenge for SAR data processing, significantly affecting image interpretation and recognition. Recently, deep learning-based SAR speckle removal algorithms have shown promising results. However, most existing algorithms rely on convolutional neural networks (CNNs), which may struggle to effectively capture global image information and can lead to texture loss. Moreover, owing to the differing characteristics of optical and SAR images, training on simulated SAR data may bring instability to real-world SAR data denoising. To address these limitations, we propose an innovative approach that integrates Swin Transformer blocks into the noise prediction network of the denoising diffusion probabilistic model (DDPM). By harnessing DDPM's robust generative capabilities and the Swin Transformer's proficiency in extracting global features, our approach aims to suppress speckle while preserving image details and enhancing authenticity. Additionally, we employ a post-processing strategy known as pixel-shuffle down-sampling (PD) refinement to mitigate the adverse effects of training data and a training process that rely on spatially uncorrelated noise, thereby improving adaptability to real-world SAR image despeckling scenarios. We conducted experiments using both simulated and real SAR image datasets, evaluating our algorithm from subjective and objective perspectives. The visual results demonstrate significant improvements in noise suppression and image detail restoration. The objective results demonstrate that our method obtains state-of-the-art performance, outperforming the second-best method by an average peak signal-to-noise ratio (PSNR) of 0.93 dB and Structural Similarity Index (SSIM) of 0.03, affirming the effectiveness of our approach. Full article

19 pages, 5134 KiB  
Article
Attribute Feature Perturbation-Based Augmentation of SAR Target Data
by Rubo Jin, Jianda Cheng, Wei Wang, Huiqiang Zhang and Jun Zhang
Sensors 2024, 24(15), 5006; https://doi.org/10.3390/s24155006 - 2 Aug 2024
Cited by 1 | Viewed by 1169
Abstract
Large-scale, diverse, and high-quality data are the basis and key to achieving a good generalization of target detection and recognition algorithms based on deep learning. However, the existing methods for the intelligent augmentation of synthetic aperture radar (SAR) images are confronted with several issues, including training instability, inferior image quality, lack of physical interpretability, etc. To solve the above problems, this paper proposes a feature-level SAR target-data augmentation method. First, an enhanced capsule neural network (CapsNet) is proposed and employed for feature extraction, decoupling the attribute information of input data. Moreover, an attention mechanism-based attribute decoupling framework is used, which is beneficial for achieving a more effective representation of features. After that, the decoupled attribute feature, including amplitude, elevation angle, azimuth angle, and shape, can be perturbed to increase the diversity of features. On this basis, the augmentation of SAR target images is realized by reconstructing the perturbed features. In contrast to the augmentation methods using random noise as input, the proposed method realizes the mapping from the input of known distribution to the change in unknown distribution. This mapping method reduces the correlation distance between the input signal and the augmented data, therefore diminishing the demand for training data. In addition, we combine pixel loss and perceptual loss in the reconstruction process, which improves the quality of the augmented SAR data. The evaluation of the real and augmented images is conducted using four assessment metrics. The images generated by this method achieve a peak signal-to-noise ratio (PSNR) of 21.6845, radiometric resolution (RL) of 3.7114, and dynamic range (DR) of 24.0654. The experimental results demonstrate the superior performance of the proposed method. Full article
(This article belongs to the Section Sensing and Imaging)

18 pages, 5449 KiB  
Article
Experimental Study to Visualize a Methane Leak of 0.25 mL/min by Direct Absorption Spectroscopy and Mid-Infrared Imaging
by Thomas Strahl, Max Bergau, Eric Maier, Johannes Herbst, Sven Rademacher, Jürgen Wöllenstein and Katrin Schmitt
Appl. Sci. 2024, 14(14), 5988; https://doi.org/10.3390/app14145988 - 9 Jul 2024
Cited by 1 | Viewed by 4078
Abstract
Tunable laser spectroscopy (TLS) with infrared (IR) imaging is a powerful tool for gas leak detection. This study focuses on direct absorption spectroscopy (DAS), which utilizes wavelength modulation to extract gas information. A tunable interband cascade laser (ICL) with an optical power of 5 mW is periodically modulated by a sawtooth injection current at 10 Hz across the methane absorption line around 3271 nm. A fast and sensitive thermal imaging camera for the mid-infrared range between 3 and 5.7 µm is operated at a frame rate of 470 Hz. Offline processing of image stacks is performed using different algorithms (DAS-F, DAS-f, and DAS-2f) based on the Lambert–Beer law and the HITRAN database. These algorithms analyze different features of the gas absorption: the area (F), the peak (f), and the second derivative (2f) of the absorbance. The methane concentration in ppm·m is determined pixel by pixel without calibration. Leaks with methane leak rates as low as 0.25 mL/min are accurately localized in a single concentration image, with pixelwise sensitivities of approximately 1 ppm·m in a laboratory environment. Concentration image sequences capture the spatiotemporal dynamics of a gas plume with high contrast. The DAS-2f concept demonstrates promising characteristics, including accuracy, precision, 1/f-noise rejection, simplicity, and computational efficiency, expanding the applications of DAS. Full article
(This article belongs to the Special Issue Novel Laser-Based Spectroscopic Techniques and Applications)
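The pixel-by-pixel concentration retrieval described above rests on the Lambert–Beer law, A = -ln(I/I0). A minimal sketch of a DAS-f-style estimate follows; the calibration constant `K_PEAK` is a hypothetical stand-in for the HITRAN-derived line data the paper uses, and all names and array shapes are illustrative:

```python
import numpy as np

# Hypothetical calibration constant: peak absorbance per ppm*m of methane
# near 3271 nm. The paper derives the actual value from the HITRAN database;
# this number is illustrative only.
K_PEAK = 4.0e-4  # absorbance / (ppm*m)

def absorbance(frames: np.ndarray, baseline: np.ndarray) -> np.ndarray:
    """Pixelwise absorbance A = -ln(I / I0) from the Lambert-Beer law."""
    return -np.log(frames / baseline)

def das_f_concentration(frames: np.ndarray, baseline: np.ndarray) -> np.ndarray:
    """DAS-f-style estimate: peak absorbance over the wavelength sweep,
    converted to a methane column density in ppm*m for every pixel."""
    a = absorbance(frames, baseline)  # shape: (sweep_samples, rows, cols)
    return a.max(axis=0) / K_PEAK     # shape: (rows, cols)

# Toy sweep of 3 wavelength samples over a 2x2 image; one pixel sees a
# weak absorption dip equivalent to ~1 ppm*m at the line center.
i0 = np.full((3, 2, 2), 1000.0)        # baseline intensity I0
i = i0.copy()
i[1, 0, 0] = 1000.0 * np.exp(-K_PEAK)  # transmitted intensity I
conc = das_f_concentration(i, i0)
print(round(conc[0, 0], 6))            # → 1.0
```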

23 pages, 8332 KiB  
Article
A Novel Two-Stage Approach for Automatic Extraction and Multi-View Generation of Litchis
by Yuanhong Li, Jing Wang, Ming Liang, Haoyu Song, Jianhong Liao and Yubin Lan
Agriculture 2024, 14(7), 1046; https://doi.org/10.3390/agriculture14071046 - 29 Jun 2024
Cited by 1 | Viewed by 1394
Abstract
Obtaining consistent multi-view images of litchis is crucial for various litchi-related studies, such as data augmentation and 3D reconstruction. This paper proposes a two-stage model that integrates the Mask2Former semantic segmentation network with the Wonder3D multi-view generation network, aiming to accurately segment and extract litchis from complex backgrounds and to generate consistent multi-view images of previously unseen litchis. In the first stage, the Mask2Former model predicts litchi masks, enabling the extraction of litchis from complex backgrounds. To further improve the accuracy of litchi branch extraction, we propose a novel method that combines the predicted masks with morphological operations and the HSV color space, ensuring accurate branch extraction even when the segmentation model's prediction accuracy is low. In the second stage, the segmented and extracted litchi images are fed into the Wonder3D network to generate multi-view images of the litchis. Among the semantic segmentation and multi-view synthesis networks compared, Mask2Former and Wonder3D performed best: the Mask2Former network achieved a mean Intersection over Union (mIoU) of 79.79% and a mean pixel accuracy (mPA) of 85.82%, while the Wonder3D network achieved a peak signal-to-noise ratio (PSNR) of 18.89 dB, a structural similarity index (SSIM) of 0.8199, and a learned perceptual image patch similarity (LPIPS) of 0.114. Combining the Mask2Former model with the Wonder3D network increased PSNR and SSIM by 0.21 dB and 0.0121, respectively, and decreased LPIPS by 0.064 compared with using the Wonder3D model alone. The proposed two-stage model therefore achieves automatic extraction and multi-view generation of litchis with high accuracy. Full article
(This article belongs to the Section Artificial Intelligence and Digital Agriculture)
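The mIoU and mPA scores quoted for Mask2Former can be computed from a class confusion matrix; a minimal sketch assuming integer label maps in which every class occurs (the function name and toy labels are illustrative):

```python
import numpy as np

def segmentation_metrics(pred: np.ndarray, gt: np.ndarray, n_classes: int):
    """Mean IoU and mean pixel accuracy from integer label maps."""
    conf = np.zeros((n_classes, n_classes), dtype=np.int64)
    np.add.at(conf, (gt.ravel(), pred.ravel()), 1)  # rows: ground truth
    tp = np.diag(conf).astype(np.float64)
    iou = tp / (conf.sum(axis=0) + conf.sum(axis=1) - tp)  # per-class IoU
    pa = tp / conf.sum(axis=1)                             # per-class accuracy
    return iou.mean(), pa.mean()

# 2x2 toy example with two classes; one background pixel is mislabeled.
gt = np.array([[0, 0], [1, 1]])
pred = np.array([[0, 1], [1, 1]])
m_iou, m_pa = segmentation_metrics(pred, gt, 2)
print(round(m_iou, 4), round(m_pa, 4))  # → 0.5833 0.75
```

Note that `np.add.at` accumulates unbuffered, so repeated (gt, pred) index pairs are counted correctly.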
