Search Results (21)

Search Parameters:
Keywords = Fréchet modules

14 pages, 1438 KiB  
Article
CDBA-GAN: A Conditional Dual-Branch Attention Generative Adversarial Network for Robust Sonar Image Generation
by Wanzeng Kong, Han Yang, Mingyang Jia and Zhe Chen
Appl. Sci. 2025, 15(13), 7212; https://doi.org/10.3390/app15137212 - 26 Jun 2025
Viewed by 296
Abstract
The acquisition of real-world sonar data necessitates substantial investments of manpower, material resources, and financial capital, rendering it challenging to obtain sufficient authentic samples for sonar-related research tasks. Consequently, sonar image simulation technology has become increasingly vital in the field of sonar data analysis. Traditional sonar simulation methods predominantly focus on low-level physical modeling, which often suffers from limited image controllability and diminished fidelity in multi-category and multi-background scenarios. To address these limitations, this paper proposes a Conditional Dual-Branch Attention Generative Adversarial Network (CDBA-GAN). The framework comprises three key innovations: a conditional information fusion module, a dual-branch attention feature fusion mechanism, and cross-layer feature reuse. By integrating encoded conditional information with the original input data of the generative adversarial network, the fusion module enables precise control over the generation of sonar images under specific conditions. A hierarchical attention mechanism is implemented, sequentially performing channel-level and pixel-level attention operations. This establishes distinct weight matrices at both granularities, thereby enhancing the correlation between corresponding elements. The dual-branch attention features are fused via a skip-connection architecture, facilitating efficient feature reuse across network layers. The experimental results demonstrate that the proposed CDBA-GAN generates condition-specific sonar images with a significantly lower Fréchet inception distance (FID) compared to existing methods. Notably, the framework exhibits robust imaging performance under noisy interference and outperforms state-of-the-art models (e.g., DCGAN, WGAN, SAGAN) in fidelity across four categorical conditions, as quantified by FID metrics. Full article
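
For reference on the headline metric here and in most of the results below: the Fréchet inception distance is the Fréchet (Wasserstein-2) distance between two Gaussians fitted to Inception-v3 features of the real and generated image sets,

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

where $(\mu_r, \Sigma_r)$ and $(\mu_g, \Sigma_g)$ are the feature means and covariances of the real and generated sets; lower values indicate closer distributions.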

19 pages, 23096 KiB  
Article
GAN-Based Super-Resolution in Linear R-SAM Imaging for Enhanced Non-Destructive Semiconductor Measurement
by Thi Thu Ha Vu, Tan Hung Vo, Trong Nhan Nguyen, Jaeyeop Choi, Le Hai Tran, Vu Hoang Minh Doan, Van Bang Nguyen, Wonjo Lee, Sudip Mondal and Junghwan Oh
Appl. Sci. 2025, 15(12), 6780; https://doi.org/10.3390/app15126780 - 17 Jun 2025
Viewed by 484
Abstract
The precise identification and non-destructive measurement of structural features and defects in semiconductor wafers are essential for ensuring process integrity and sustaining high yield in advanced manufacturing environments. Unlike conventional measurement techniques, scanning acoustic microscopy (SAM) is an advanced method that provides detailed visualizations of both surface and internal wafer structures. However, in practical industrial applications, the scanning time and image quality of SAM significantly impact its overall performance and utility. Prolonged scanning durations can lead to production bottlenecks, while suboptimal image quality can compromise the accuracy of defect detection. To address these challenges, this study proposes LinearTGAN, an improved generative adversarial network (GAN)-based model specifically designed to improve the resolution of linear acoustic wafer images acquired by the breakthrough rotary scanning acoustic microscopy (R-SAM) system. Empirical evaluations demonstrate that the proposed model significantly outperforms conventional GAN-based approaches, achieving a Peak Signal-to-Noise Ratio (PSNR) of 29.479 dB, a Structural Similarity Index Measure (SSIM) of 0.874, a Learned Perceptual Image Patch Similarity (LPIPS) of 0.095, and a Fréchet Inception Distance (FID) of 0.445. To assess the measurement aspect of LinearTGAN, a lightweight defect segmentation module was integrated and tested on annotated wafer datasets. The super-resolved images produced by LinearTGAN significantly enhanced segmentation accuracy and improved the sensitivity of microcrack detection. Furthermore, the deployment of LinearTGAN within the R-SAM system yielded a 92% improvement in scanning performance for 12-inch wafers while simultaneously enhancing image fidelity. The integration of super-resolution techniques into R-SAM significantly advances the precision, robustness, and efficiency of non-destructive measurements, highlighting their potential to have a transformative impact in semiconductor metrology and quality assurance. Full article
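
The PSNR and SSIM figures quoted above can be reproduced with standard implementations; a minimal sketch using scikit-image, where the file paths are placeholders rather than the authors' evaluation code:

```python
# Compare a ground-truth wafer image with a super-resolved output using the two
# standard full-reference metrics reported for LinearTGAN.
import numpy as np
from skimage import io
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

gt = io.imread("wafer_gt.png", as_gray=True).astype(np.float64)  # placeholder
sr = io.imread("wafer_sr.png", as_gray=True).astype(np.float64)  # placeholder

data_range = gt.max() - gt.min()
psnr = peak_signal_noise_ratio(gt, sr, data_range=data_range)
ssim = structural_similarity(gt, sr, data_range=data_range)
print(f"PSNR: {psnr:.3f} dB, SSIM: {ssim:.3f}")
```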

11 pages, 3520 KiB  
Article
Enhancing Atmospheric Turbulence Phase Screen Generation with an Improved Diffusion Model and U-Net Noise Generation Network
by Hangning Kou, Min Wan and Jingliang Gu
Photonics 2025, 12(4), 381; https://doi.org/10.3390/photonics12040381 - 15 Apr 2025
Viewed by 690
Abstract
Simulating atmospheric turbulence phase screens is essential for optical system research and turbulence compensation. Traditional methods, such as multi-harmonic power spectrum inversion and Zernike polynomial fitting, often suffer from sampling errors and limited diversity. To overcome these challenges, this paper proposes an improved denoising diffusion probabilistic model (DDPM) for generating high-fidelity atmospheric turbulence phase screens. The model effectively captures the statistical distribution of turbulence phase screens using small training datasets. A refined loss function incorporating the structure function enhances accuracy. Additionally, a self-attention module strengthens the model’s ability to learn phase screen features. The experimental results demonstrate that the proposed approach significantly reduces the Fréchet Inception Distance (FID) from 154.45 to 59.80, with the mean loss stabilizing around 0.1 after 50,000 iterations. The generated phase screens exhibit high precision and diversity, providing an efficient and adaptable solution for atmospheric turbulence simulation. Full article
(This article belongs to the Section Data-Science Based Techniques in Photonics)
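
The structure-function term in the refined loss is not spelled out in the abstract, but the natural reference is the Kolmogorov phase structure function, against which generated screens are typically validated: for Fried parameter $r_0$,

```latex
D_\varphi(r) = \left\langle \left[ \varphi(\mathbf{x}) - \varphi(\mathbf{x}+\mathbf{r}) \right]^2 \right\rangle
  = 6.88 \left( \frac{r}{r_0} \right)^{5/3}
```

so a generated screen can be scored by how closely its empirical $D_\varphi$ tracks this curve.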

24 pages, 7057 KiB  
Article
Construction and Enhancement of a Rural Road Instance Segmentation Dataset Based on an Improved StyleGAN2-ADA
by Zhixin Yao, Renna Xi, Taihong Zhang, Yunjie Zhao, Yongqiang Tian and Wenjing Hou
Sensors 2025, 25(8), 2477; https://doi.org/10.3390/s25082477 - 15 Apr 2025
Viewed by 431
Abstract
With the advancement of agricultural automation, the demand for road recognition and understanding in agricultural machinery autonomous driving systems has significantly increased. To address the scarcity of instance segmentation data for rural roads and rural unstructured scenes, particularly the lack of support for high-resolution and fine-grained classification, a 20-class instance segmentation dataset was constructed, comprising 10,062 independently annotated instances. An improved StyleGAN2-ADA data augmentation method was proposed to generate higher-quality image data. This method incorporates a decoupled mapping network (DMN) to reduce the coupling degree of latent codes in W-space and integrates the advantages of convolutional networks and transformers by designing a convolutional coupling transfer block (CCTB). The core cross-shaped window self-attention mechanism in the CCTB enhances the network’s ability to capture complex contextual information and spatial layouts. Ablation experiments comparing the improved and original StyleGAN2-ADA networks demonstrate significant improvements, with the inception score (IS) increasing from 42.38 to 77.31 and the Fréchet inception distance (FID) decreasing from 25.09 to 12.42, indicating a notable enhancement in data generation quality and authenticity. In order to verify the effect of data enhancement on the model performance, the algorithms Mask R-CNN, SOLOv2, YOLOv8n, and OneFormer were tested to compare the performance difference between the original dataset and the enhanced dataset, which further confirms the effectiveness of the improved module. Full article
(This article belongs to the Section Sensing and Imaging)
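
For context on the IS numbers above: the inception score evaluates a generator $G$ through an Inception classifier's label posterior $p(y \mid x)$ and its marginal $p(y)$ over generated samples,

```latex
\mathrm{IS}(G) = \exp\!\left( \mathbb{E}_{x \sim G} \left[ D_{\mathrm{KL}}\!\left( p(y \mid x) \,\Vert\, p(y) \right) \right] \right)
```

rewarding images that are individually recognizable (peaked $p(y \mid x)$) yet collectively diverse (broad $p(y)$); higher is better, whereas FID is better when lower.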

21 pages, 6983 KiB  
Article
OP-Gen: A High-Quality Remote Sensing Image Generation Algorithm Guided by OSM Images and Textual Prompts
by Huolin Xiong, Zekun Li, Qunbo Lv, Baoyu Zhu, Yu Zhang, Chaoyang Yu and Zheng Tan
Remote Sens. 2025, 17(7), 1226; https://doi.org/10.3390/rs17071226 - 30 Mar 2025
Viewed by 813
Abstract
The application of diffusion models in the field of remote sensing image generation has significantly improved the performance of generation algorithms. However, existing methods still exhibit certain limitations, such as the inability to generate images with rich texture details and minimal geometric distortions in a controllable manner. To address these shortcomings, this work introduces an innovative remote sensing image generation algorithm, OP-Gen, which is guided by textual descriptions and OpenStreetMap (OSM) images. OP-Gen incorporates two information extraction branches: ControlNet and OSM-prompt (OP). The ControlNet branch extracts structural and spatial information from OSM images and injects this information into the diffusion model, providing guidance for the overall structural framework of the generated images. In the OP branch, we design an OP-Controller module, which extracts detailed semantic information from textual prompts based on the structural information of the OSM image. This information is subsequently injected into the diffusion model, enriching the generated images with fine-grained details, aligning the generated details with the structural framework, and thus significantly enhancing the realism of the output. The proposed OP-Gen algorithm achieves state-of-the-art performance in both qualitative and quantitative evaluations. The qualitative results demonstrate that OP-Gen outperforms existing methods in terms of structural coherence and texture detail richness. Quantitatively, the algorithm achieves a Fréchet inception distance (FID) of 45.01, a structural similarity index measure (SSIM) of 0.1904, and a Contrastive Language-Image Pretraining (CLIP) score of 0.3071, all of which represent the best performance among the current algorithms of the same type. Full article
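
OP-Gen itself is not reproduced here, but the structure-plus-text conditioning pattern its ControlNet branch builds on can be sketched with the generic diffusers ControlNet API; the checkpoint names below are stand-ins (a segmentation-conditioned ControlNet in place of an OSM-trained branch), not the paper's models:

```python
# Hedged sketch of structure-guided text-to-image generation: a ControlNet
# injects spatial layout from a map-like image while the prompt supplies detail.
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from PIL import Image

controlnet = ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-seg")
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet
)

osm_map = Image.open("osm_tile.png")  # rasterized OSM layout (placeholder path)
prompt = "aerial photograph, dense urban blocks, detailed rooftops and roads"
image = pipe(prompt, image=osm_map, num_inference_steps=30).images[0]
image.save("generated_rs_tile.png")
```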

22 pages, 4627 KiB  
Article
Exploration of Cross-Modal AIGC Integration in Unity3D for Game Art Creation
by Qinchuan Liu, Jiaqi Li and Wenjie Hu
Electronics 2025, 14(6), 1101; https://doi.org/10.3390/electronics14061101 - 11 Mar 2025
Viewed by 1432
Abstract
This advanced exploration of integrating cross-modal Artificial-Intelligence-Generated Content (AIGC) within the Unity3D game engine seeks to elevate the diversity and coherence of image generation in game art creation. The theoretical framework proposed dives into the seamless incorporation of generated visuals within Unity3D, introducing a novel Generative Adversarial Network (GAN) structure. In this architecture, both the Generator and Discriminator embrace a Transformer model, adeptly managing sequential data and long-range dependencies. Furthermore, the introduction of a cross-modal attention module enables the dynamic calculation of attention weights between text descriptors and generated imagery, allowing for real-time modulation of modal inputs, ultimately refining the quality and variety of generated visuals. The experimental results show outstanding performance on technical benchmarks, with an inception score reaching 8.95 and a Frechet Inception Distance plummeting to 20.1, signifying exceptional diversity and image quality. Surveys reveal that users rated the model’s output highly, citing both its adherence to text prompts and its strong visual allure. Moreover, the model demonstrates impressive stylistic variety, producing imagery with intricate and varied aesthetics. Though training demands are extended, the payoff in quality and diversity holds substantial practical value. This method exhibits substantial transformative potential in Unity3D development, simultaneously improving development efficiency and optimizing the visual fidelity of game assets. Full article
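
The cross-modal attention module described, which computes attention weights between text descriptors and generated imagery, follows the standard query/key/value pattern with queries from image tokens and keys/values from text tokens; the sketch below is illustrative, with assumed dimensions rather than the paper's:

```python
import torch
import torch.nn as nn

class CrossModalAttention(nn.Module):
    # Illustrative sketch: image tokens attend over text tokens so each spatial
    # feature is modulated by the most relevant words. Dimensions are assumptions.
    def __init__(self, img_dim=512, txt_dim=768, attn_dim=512):
        super().__init__()
        self.q = nn.Linear(img_dim, attn_dim)
        self.k = nn.Linear(txt_dim, attn_dim)
        self.v = nn.Linear(txt_dim, img_dim)

    def forward(self, img_tokens, txt_tokens):
        # img_tokens: (B, N_img, img_dim); txt_tokens: (B, N_txt, txt_dim)
        q, k, v = self.q(img_tokens), self.k(txt_tokens), self.v(txt_tokens)
        attn = torch.softmax(q @ k.transpose(1, 2) / k.shape[-1] ** 0.5, dim=-1)
        return img_tokens + attn @ v  # residual fusion of text-conditioned features

tokens = CrossModalAttention()(torch.randn(2, 64, 512), torch.randn(2, 16, 768))
print(tokens.shape)  # torch.Size([2, 64, 512])
```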

19 pages, 10633 KiB  
Article
RSVQ-Diffusion Model for Text-to-Remote-Sensing Image Generation
by Xin Gao, Yao Fu, Xiaonan Jiang, Fanlu Wu, Yu Zhang, Tianjiao Fu, Chao Li and Junyan Pei
Appl. Sci. 2025, 15(3), 1121; https://doi.org/10.3390/app15031121 - 23 Jan 2025
Cited by 1 | Viewed by 1822
Abstract
Text-guided remote sensing image generation methods, such as generative adversarial networks applied to remote sensing tasks, show great potential in many practical applications; however, the generated images still face challenges such as low realism and unclear details. Moreover, the inherent spatial complexity of remote sensing images and the limited scale of publicly available datasets make it particularly challenging to generate high-quality remote sensing images from text descriptions. To address these challenges, this paper proposes the RSVQ-Diffusion model for remote sensing image generation, achieving high-quality text-to-remote-sensing image generation applicable for target detection, simulation, and other fields. Specifically, this paper designs a spatial position encoding mechanism to integrate the spatial information of remote sensing images during model training. Additionally, the Transformer module is improved by incorporating a short-sequence local perception mechanism into the diffusion image decoder, addressing issues of unclear details and regional distortions in generated remote sensing images. Compared with the VQ-Diffusion model, our proposed model achieves significant improvements in the Fréchet Inception Distance (FID), the Inception Score (IS), and the text–image alignment (Contrastive Language-Image Pre-training, CLIP) scores. The FID score successfully decreased from 96.68 to 90.36; the CLIP score increased from 26.92 to 27.22, and the IS increased from 7.11 to 7.24. Full article
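
The CLIP alignment score used above is, in its common form, the cosine similarity between CLIP embeddings of the prompt and the generated image (often scaled by 100, which would match the 26.92–27.22 range); a sketch with the Hugging Face transformers CLIP, where the model choice and file path are assumptions:

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("generated_scene.png")  # placeholder path
text = "an airport with two runways and parked aircraft"

inputs = processor(text=[text], images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    out = model(**inputs)
img = out.image_embeds / out.image_embeds.norm(dim=-1, keepdim=True)
txt = out.text_embeds / out.text_embeds.norm(dim=-1, keepdim=True)
print(f"CLIP score: {100 * (img @ txt.T).item():.2f}")
```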

20 pages, 2857 KiB  
Article
SCAGAN: Wireless Capsule Endoscopy Lesion Image Generation Model Based on GAN
by Zhiguo Xiao, Dong Zhang, Xianqing Chen and Dongni Li
Electronics 2025, 14(3), 428; https://doi.org/10.3390/electronics14030428 - 22 Jan 2025
Viewed by 1059
Abstract
The wireless capsule endoscope (WCE) has been utilized for human digestive tract examinations for over 20 years. Given the complex environment of the digestive tract and the challenge of detecting multi-category lesion images, enhancing model generalization ability is crucial. However, traditional data augmentation methods struggle to generate sufficiently diverse data. In this study, we propose a novel generative adversarial network, Special Common Attention Generative Adversarial Network (SCAGAN), to generate lesion images for capsule endoscopy. The SCAGAN model can adaptively integrate both the internal features and external global dependencies of the samples, enabling the generator to not only accurately capture the key structures and features of capsule endoscopic images, but also enhance the modeling of lesion complexity. Additionally, SCAGAN incorporates global context information to improve the overall consistency and detail of the generated images. To further enhance adaptability, self-modulation normalization is used, along with the Structural Similarity Index (SSIM) loss function to ensure structural authenticity. The Differentiable Data Augmentation (DiffAug) technique is employed to improve the model’s performance in small sample environments and balance the training process by adjusting learning rates to address issues of slow learning due to discriminator regularization. Experimental results show that SCAGAN significantly improves image quality and diversity, achieving state-of-the-art (SOTA) performance in the Frechet Inception Distance (FID) index. Moreover, when the generated lesion images were added to the dataset, the mean average precision (mAP) of the YOLOv9-based lesion detection model increased by 1.495%, demonstrating SCAGAN’s effectiveness in optimizing lesion detection. SCAGAN effectively addresses the challenges of lesion image generation for capsule endoscopy, improving both image quality and detection model performance. The proposed approach offers a promising solution for enhancing the training of lesion detection models in the context of capsule endoscopy. Full article
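
How SCAGAN pairs generated and reference frames for its SSIM loss is not specified in the abstract; one plausible reading, with an assumed weight lam and kornia's differentiable SSIM, is:

```python
import torch
import kornia.losses

def generator_loss(fake, ref, adv_loss, lam=1.0):
    # Sketch only: adversarial term plus a structural term. kornia's ssim_loss
    # returns DSSIM = (1 - SSIM) / 2, so minimizing it pushes SSIM toward 1.
    # lam and the fake/reference pairing are assumptions, not the paper's setup.
    ssim_term = kornia.losses.ssim_loss(fake, ref, window_size=11)
    return adv_loss + lam * ssim_term

fake = torch.rand(4, 3, 128, 128)
ref = torch.rand(4, 3, 128, 128)
print(generator_loss(fake, ref, adv_loss=torch.tensor(0.7)))
```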

32 pages, 8354 KiB  
Article
Estimation of Fractal Dimension and Detection of Fake Finger-Vein Images for Finger-Vein Recognition
by Seung Gu Kim, Jin Seong Hong, Jung Soo Kim and Kang Ryoung Park
Fractal Fract. 2024, 8(11), 646; https://doi.org/10.3390/fractalfract8110646 - 31 Oct 2024
Cited by 3 | Viewed by 1428
Abstract
With recent advancements in deep learning, spoofing techniques have developed and generative adversarial networks (GANs) have become an emerging threat to finger-vein recognition systems. Therefore, previous research has been performed to generate finger-vein images for training spoof detectors. However, these are limited and researchers still cannot generate elaborate fake finger-vein images. Therefore, we develop a new densely updated contrastive learning-based self-attention generative adversarial network (DCS-GAN) to create elaborate fake finger-vein images, enabling the training of corresponding spoof detectors. Additionally, we propose an enhanced convolutional network for a next-dimension (ConvNeXt)-Small model with a large kernel attention module as a new spoof detector capable of distinguishing the generated fake finger-vein images. To improve the spoof detection performance of the proposed method, we introduce fractal dimension estimation to analyze the complexity and irregularity of class activation maps from real and fake finger-vein images, enabling the generation of more realistic and sophisticated fake finger-vein images. Experimental results obtained using two open databases showed that the fake images by the DCS-GAN exhibited Frechet inception distances (FID) of 7.601 and 23.351, with Wasserstein distances (WD) of 18.158 and 10.123, respectively, confirming the possibility of spoof attacks when using existing state-of-the-art (SOTA) frameworks of spoof detection. Furthermore, experiments conducted with the proposed spoof detector yielded average classification error rates of 0.4% and 0.12% on the two aforementioned open databases, respectively, outperforming existing SOTA methods for spoof detection. Full article
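
The fractal dimension estimation mentioned is commonly done by box counting (the Minkowski–Bouligand dimension); a minimal NumPy sketch over a binarized class activation map, not the authors' implementation:

```python
import numpy as np

def box_counting_dimension(mask):
    # Count occupied boxes at halving scales, then fit the slope of
    # log N(s) versus log(1/s); the slope estimates the fractal dimension.
    n = 2 ** int(np.floor(np.log2(min(mask.shape))))
    mask = mask[:n, :n]
    sizes, counts = [], []
    s = n
    while s >= 2:
        view = mask.reshape(n // s, s, n // s, s)  # partition into s x s boxes
        counts.append(max(int(view.any(axis=(1, 3)).sum()), 1))
        sizes.append(s)
        s //= 2
    slope, _ = np.polyfit(np.log(1.0 / np.asarray(sizes)), np.log(counts), 1)
    return slope

cam = np.random.rand(256, 256) > 0.6  # stand-in for a thresholded activation map
print(f"estimated fractal dimension: {box_counting_dimension(cam):.3f}")
```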

18 pages, 59323 KiB  
Article
Method for Augmenting Side-Scan Sonar Seafloor Sediment Image Dataset Based on BCEL1-CBAM-INGAN
by Haixing Xia, Yang Cui, Shaohua Jin, Gang Bian, Wei Zhang and Chengyang Peng
J. Imaging 2024, 10(9), 233; https://doi.org/10.3390/jimaging10090233 - 20 Sep 2024
Cited by 1 | Viewed by 1045
Abstract
In this paper, a method for augmenting samples of side-scan sonar seafloor sediment images based on CBAM-BCEL1-INGAN is proposed, aiming to address the difficulties in acquiring and labeling datasets, as well as the insufficient diversity and quantity of data samples. Firstly, a Convolutional Block Attention Module (CBAM) is integrated into the residual blocks of the INGAN generator to enhance the learning of specific attributes and improve the quality of the generated images. Secondly, a BCEL1 loss function (combining binary cross-entropy and L1 loss functions) is introduced into the discriminator, enabling it to focus on both global image consistency and finer distinctions for better generation results. Finally, augmented samples are input into an AlexNet classifier to verify their authenticity. Experimental results demonstrate the excellent performance of the method in generating images of coarse sand, gravel, and bedrock, as evidenced by significant improvements in the Frechet Inception Distance (FID) and Inception Score (IS). The introduction of the CBAM and BCEL1 loss function notably enhances the quality and details of the generated images. Moreover, classification experiments using the AlexNet classifier show an increase in the recognition rate from 90.5% using only INGAN-generated images of bedrock to 97.3% using images augmented using our method, a 6.8-percentage-point improvement. Additionally, the classification accuracy for the bedrock type improves by 5.2% when images enhanced using the method presented in this paper are added to the training set, which is 2.7% higher than that achieved by simple dataset amplification. This validates the effectiveness of our method in the task of generating seafloor sediment images, partially alleviating the scarcity of side-scan sonar seafloor sediment image data. Full article
(This article belongs to the Section Image and Video Processing)
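
The BCEL1 loss as described combines binary cross-entropy on the discriminator's judgment with an L1 term on finer differences; a plausible PyTorch reading, with the relative weight lam as an assumption:

```python
import torch
import torch.nn.functional as F

def bcel1_loss(d_logits, target, generated, reference, lam=1.0):
    # Sketch only: BCE enforces global real/fake consistency, L1 penalizes
    # per-pixel deviations; lam balances the two (assumed, not from the paper).
    bce = F.binary_cross_entropy_with_logits(d_logits, target)
    l1 = F.l1_loss(generated, reference)
    return bce + lam * l1

d_logits = torch.randn(8, 1)
target = torch.ones(8, 1)
generated = torch.rand(8, 1, 128, 128)
reference = torch.rand(8, 1, 128, 128)
print(bcel1_loss(d_logits, target, generated, reference))
```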

14 pages, 469 KiB  
Article
Generating Artistic Portraits from Face Photos with Feature Disentanglement and Reconstruction
by Haoran Guo, Zhe Ma, Xuhesheng Chen, Xukang Wang, Jun Xu and Yangming Zheng
Electronics 2024, 13(5), 955; https://doi.org/10.3390/electronics13050955 - 1 Mar 2024
Cited by 5 | Viewed by 3719
Abstract
Generating artistic portraits from face photos presents a complex challenge that requires high-quality image synthesis and a deep understanding of artistic style and facial features. Traditional generative adversarial networks (GANs) have made significant strides in image synthesis; however, they encounter limitations in artistic portrait generation, particularly in the nuanced disentanglement and reconstruction of facial features and artistic styles. This paper introduces a novel approach that overcomes these limitations by employing feature disentanglement and reconstruction techniques, enabling the generation of artistic portraits that more faithfully retain the subject’s identity and expressiveness while incorporating diverse artistic styles. Our method integrates six key components: a U-Net-based image generator, an image discriminator, a feature-disentanglement module, a feature-reconstruction module, a U-Net-based information generator, and a cross-modal fusion module, working in concert to transform face photos into artistic portraits. Through extensive experiments on the APDrawing dataset, our approach demonstrated superior performance in visual quality, achieving a significant reduction in the Fréchet Inception Distance (FID) score to 61.23, highlighting its ability to generate more-realistic and -diverse artistic portraits compared to existing methods. Ablation studies further validated the effectiveness of each component in our method, underscoring the importance of feature disentanglement and reconstruction in enhancing the artistic quality of the generated portraits. Full article
(This article belongs to the Section Artificial Intelligence)

19 pages, 5054 KiB  
Article
Real-Time Detection Algorithm for Kiwifruit Canker Based on a Lightweight and Efficient Generative Adversarial Network
by Ying Xiang, Jia Yao, Yiyu Yang, Kaikai Yao, Cuiping Wu, Xiaobin Yue, Zhenghao Li, Miaomiao Ma, Jie Zhang and Guoshu Gong
Plants 2023, 12(17), 3053; https://doi.org/10.3390/plants12173053 - 25 Aug 2023
Cited by 10 | Viewed by 2098
Abstract
Disease diagnosis and control play important roles in agriculture and crop protection. Traditional methods of identifying plant disease rely primarily on human vision and manual inspection, which are subjective, have low accuracy, and make it difficult to estimate the situation in real time. At present, an intelligent detection technology based on computer vision is becoming an increasingly important tool used to monitor and control crop disease. However, the use of this technology often requires the collection of a substantial amount of specialized data in advance. Due to the seasonality and uncertainty of many crop pathogeneses, as well as some rare diseases or rare species, such data requirements are difficult to meet, leading to difficulties in achieving high levels of detection accuracy. Here, we use kiwifruit trunk bacterial canker (Pseudomonas syringae pv. actinidiae) as an example and propose a high-precision detection method to address the issue mentioned above. We introduce a lightweight and efficient image generative model capable of generating realistic and diverse images of kiwifruit trunk disease and expanding the original dataset. We also utilize the YOLOv8 model to perform disease detection; this model demonstrates real-time detection capability, taking only 0.01 s per image. The specific contributions of this study are as follows: (1) a depth-wise separable convolution is utilized to replace some of the ordinary convolutions and introduce noise to improve the diversity of the generated images; (2) we propose the GASLE module by embedding a GAM to adjust the importance of different channels and reduce the loss of spatial information; (3) we use an AdaMod optimizer to increase the convergence of the network; and (4) we select a real-time YOLOv8 model to perform effect verification. The results of this experiment show that the Fréchet Inception Distance (FID) of the proposed generative model reaches 84.18, a decrease of 41.23 compared to FastGAN and of 2.1 compared to ProjectedGAN. The mean Average Precision (mAP@0.5) on the YOLOv8 network reaches 87.17%, which is nearly 17% higher than that of the original algorithm. These results substantiate the effectiveness of our generative model, providing a robust strategy for image generation and disease detection in the plant kingdom. Full article
(This article belongs to the Special Issue Precision Farming Application in Crop Protection)
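
Contribution (1)'s depth-wise separable convolution is a standard factorization of a full convolution into a per-channel spatial filter followed by a 1×1 pointwise mix, which is what makes the generator lightweight; channel sizes below are illustrative:

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    # Depthwise conv filters each channel independently (groups=in_ch); the 1x1
    # pointwise conv then mixes channels. Parameter count drops from
    # in_ch*out_ch*k*k to in_ch*k*k + in_ch*out_ch.
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, k, padding=k // 2, groups=in_ch)
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

x = torch.randn(1, 64, 32, 32)
print(DepthwiseSeparableConv(64, 128)(x).shape)  # torch.Size([1, 128, 32, 32])
```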

20 pages, 5452 KiB  
Article
SUGAN: A Stable U-Net Based Generative Adversarial Network
by Shijie Cheng, Lingfeng Wang, Min Zhang, Cheng Zeng and Yan Meng
Sensors 2023, 23(17), 7338; https://doi.org/10.3390/s23177338 - 23 Aug 2023
Cited by 6 | Viewed by 5035
Abstract
As one of the representative models in the field of image generation, generative adversarial networks (GANs) face a significant challenge: how to make the best trade-off between the quality of generated images and training stability. The U-Net based GAN (U-Net GAN), a recently developed approach, can generate high-quality synthetic images by using a U-Net architecture for the discriminator. However, this model may suffer from severe mode collapse. In this study, a stable U-Net GAN (SUGAN) is proposed to mainly solve this problem. First, a gradient normalization module is introduced to the discriminator of U-Net GAN. This module effectively reduces gradient magnitudes, thereby greatly alleviating the problems of gradient instability and overfitting. As a result, the training stability of the GAN model is improved. Additionally, in order to solve the problem of blurred edges of the generated images, a modified residual network is used in the generator. This modification enhances its ability to capture image details, leading to higher-definition generated images. Extensive experiments conducted on several datasets show that the proposed SUGAN significantly improves over the Inception Score (IS) and Fréchet Inception Distance (FID) metrics compared with several state-of-the-art and classic GANs. The training process of our SUGAN is stable, and the quality and diversity of the generated samples are higher. This clearly demonstrates the effectiveness of our approach for image generation tasks. The source code and trained model of our SUGAN have been publicly released. Full article
(This article belongs to the Section Sensing and Imaging)
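
Gradient normalization for GAN discriminators (in the formulation of Wu et al., 2021, from which SUGAN's module may differ in detail) rescales the discriminator output by its input-gradient norm so the normalized function has bounded gradients:

```python
import torch

def gradient_normalized_output(discriminator, x):
    # f_hat(x) = f(x) / (||grad_x f(x)|| + |f(x)|): the normalized discriminator
    # is approximately 1-Lipschitz, reducing gradient instability and
    # overfitting. Sketch of the published GN formulation, not SUGAN's code.
    x = x.clone().requires_grad_(True)
    f = discriminator(x).flatten(1).sum(dim=1)
    grad = torch.autograd.grad(f.sum(), x, create_graph=True)[0]
    grad_norm = grad.flatten(1).norm(p=2, dim=1)
    return f / (grad_norm + f.abs() + 1e-8)
```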

20 pages, 346 KiB  
Article
Equivalent Base Expansions in the Space of Cliffordian Functions
by Mohra Zayed and Gamal Hassan
Axioms 2023, 12(6), 544; https://doi.org/10.3390/axioms12060544 - 31 May 2023
Cited by 3 | Viewed by 1181
Abstract
Intensive research efforts have been dedicated to the extension and development of essential aspects that resulted in the theory of one complex variable for higher-dimensional spaces. Clifford analysis was created several decades ago to provide an elegant and powerful generalization of complex analyses. In this paper, first, we derive a new base of special monogenic polynomials (SMPs) in Fréchet–Cliffordian modules, named the equivalent base, and examine its convergence properties for several cases according to certain conditions applied to related constituent bases. Subsequently, we characterize its effectiveness in various convergence regions, such as closed balls, open balls, at the origin, and for all entire special monogenic functions (SMFs). Moreover, the upper and lower bounds of the order of the equivalent base are determined and proved to be attainable. This work improves and generalizes several existing results in the complex and Clifford context involving the convergence properties of the product and similar bases. Full article
(This article belongs to the Special Issue Recent Advances in Complex Analysis and Applications)
16 pages, 3679 KiB  
Article
A Deep Learning Model for Ship Trajectory Prediction Using Automatic Identification System (AIS) Data
by Xinyu Wang and Yingjie Xiao
Information 2023, 14(4), 212; https://doi.org/10.3390/info14040212 - 30 Mar 2023
Cited by 19 | Viewed by 6582
Abstract
The rapid growth of ship traffic leads to traffic congestion, which causes maritime accidents. Accurate ship trajectory prediction can improve the efficiency of navigation and maritime traffic safety. Previous studies have focused on developing a ship trajectory prediction model using a deep learning approach, such as a long short-term memory (LSTM) network. However, a convolutional neural network (CNN) has rarely been applied to extract the potential correlation among different variables (e.g., longitude, latitude, speed, course over ground, etc.). Therefore, this study proposes a deep-learning-based ship trajectory prediction model (namely, CNN-LSTM-SE) that considers the potential correlation of variables and temporal characteristics. This model integrates a CNN module, an LSTM module and a squeeze-and-excitation (SE) module. The CNN module is utilized to extract data on the relationship among different variables (e.g., longitude, latitude, speed and course over ground), the LSTM module is applied to capture temporal dependencies, and the SE module is introduced to adaptively adjust the importance of channel features and focus on the more significant ones. Comparison experiments of two cargo ships at a time interval of 10 s show that the proposed CNN-LSTM-SE model can obtain the best prediction performance compared with other models on evaluation indexes of average root mean squared error (ARMSE), average mean absolute percentage error (AMAPE), average Euclidean distance (AED), average ground distance (AGD) and Fréchet distance (FD). Full article
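
The Fréchet distance (FD) index above measures how similar the predicted and actual trajectories are as curves; its discrete variant has a short dynamic-programming form, sketched here with trajectories as (lon, lat) point lists, plain Euclidean distance for simplicity, and illustrative coordinates:

```python
import math

def discrete_frechet(p, q):
    # Discrete Frechet distance between polylines p and q: the minimal "leash
    # length" needed by two walkers traversing the curves monotonically.
    d = lambda a, b: math.dist(a, b)
    n, m = len(p), len(q)
    ca = [[0.0] * m for _ in range(n)]
    ca[0][0] = d(p[0], q[0])
    for i in range(1, n):
        ca[i][0] = max(ca[i - 1][0], d(p[i], q[0]))
    for j in range(1, m):
        ca[0][j] = max(ca[0][j - 1], d(p[0], q[j]))
    for i in range(1, n):
        for j in range(1, m):
            step = min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1])
            ca[i][j] = max(step, d(p[i], q[j]))
    return ca[n - 1][m - 1]

pred = [(121.50, 31.20), (121.51, 31.21), (121.53, 31.21)]
true = [(121.50, 31.20), (121.52, 31.20), (121.53, 31.22)]
print(f"discrete Frechet distance: {discrete_frechet(pred, true):.4f}")
```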