Search Results (164)

Search Parameters:
Keywords = enhanced CycleGAN

23 pages, 2315 KB  
Article
Unsupervised Metal Artifact Reduction in Dental CBCT Using Fine-Tuned Cycle-Consistent Adversarial Networks
by Thamindu Chamika, Sithum N. A. Dhanapala, Sasindu Nimalaweera, Maheshi B. Dissanayake and Ruwan D. Jayasinghe
Digital 2026, 6(2), 31; https://doi.org/10.3390/digital6020031 - 17 Apr 2026
Viewed by 179
Abstract
Metal artifacts generated by dental implants significantly degrade cone-beam computed tomography (CBCT) volumes, obscuring critical anatomical structures and compromising diagnostic precision. To address this, an unsupervised deep learning framework has been proposed for Metal Artifact Reduction (MAR) utilizing a Cycle-Consistent Adversarial Network (CycleGAN) optimized for high-fidelity restoration. Unlike supervised methods that rely on unattainable voxel-aligned paired datasets, the proposed approach leverages an unpaired dataset of approximately 4000 images, curated from the public ToothFairy dataset. The architecture integrates U-Net-based generators and PatchGAN discriminators, specifically tuned to mitigate generative hallucinations and preserve morphological integrity. Quantitative benchmarking on a held-out test set demonstrates a 34.6% improvement in the Blind/Referenceless Image Spatial Quality Evaluator (BRISQUE) score, a substantial reduction in Fréchet Inception Distance (FID) from 207.03 to 157.04, and a superior Structural Similarity Index Measure (SSIM) of 0.9105. The framework achieves real-time efficiency with a 3.03 ms inference time per slice, effectively suppressing artifacts while preserving anatomical detail. Expert validation confirms high fidelity; however, to ensure reliability in extreme cases, the architecture is recommended as a clinical decision-support tool under human-in-the-loop oversight. By enhancing diagnostic clarity via a scalable software pipeline, this study provides a robust solution for high-fidelity dental implant imaging.
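The cycle-consistency constraint that lets CycleGAN train on unpaired artifact/clean slices can be sketched as follows. This is an illustrative sketch only: the identity "generators", array shapes, and the weight `lam` are placeholders, not the paper's U-Net models.

```python
import numpy as np

def cycle_consistency_loss(G, F, x, y, lam=10.0):
    """L1 cycle loss lam * (|F(G(x)) - x| + |G(F(y)) - y|), as in CycleGAN.

    G: artifact -> clean translator, F: clean -> artifact translator.
    Both are stand-in callables here (the paper uses U-Net generators).
    """
    forward = np.mean(np.abs(F(G(x)) - x))   # x -> clean -> back to x
    backward = np.mean(np.abs(G(F(y)) - y))  # y -> artifact -> back to y
    return lam * (forward + backward)

# Toy sanity check with identity "generators": the cycle loss is exactly zero.
x = np.random.rand(4, 64, 64)
y = np.random.rand(4, 64, 64)
ident = lambda t: t
print(cycle_consistency_loss(ident, ident, x, y))  # 0.0
```

In practice this term is added to the adversarial losses of both discriminators; it is what discourages the generators from hallucinating anatomy that is not in the input.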

23 pages, 10813 KB  
Article
Cross-Breed Few-Shot Learning for Pig Detection via Improved YOLOv7 and CycleGAN-Based Sample Generation
by Yizheng Zhuang, Lingyao Xu, Jinyun Jiang, Zhenyang Zhang, Yiting Wang, Pengfei Yu, Yihan Fu, Haoqi Xu, Wei Zhao, Xiaoliang Hou, Jianlan Wang, Yongqi He, Yan Fu, Zhe Zhang, Qishan Wang, Yuchun Pan and Zhen Wang
Biology 2026, 15(8), 623; https://doi.org/10.3390/biology15080623 - 16 Apr 2026
Viewed by 234
Abstract
Complex farming environments, breed variation, and the high cost of manual annotation remain major obstacles to robust pig detection, while cross-breed detection under few-shot conditions has been insufficiently explored in previous studies. To address this gap, we propose a few-shot pig detection framework that combines an improved YOLOv7 detector with CycleGAN-based pseudo-sample generation. The detector was enhanced through anchor optimization, Efficient Channel Attention (ECA), and Log-Sum-Exp (LSE) pooling to improve localization and feature discrimination in dense pigsty scenes. In addition, an optimized CycleGAN with perceptual loss was used to generate synthetic Duroc-like pig images to enrich the limited target-domain training set. The framework was evaluated using a two-dataset design: a White Pig Base Dataset was used to establish the source-domain detector and validate the architectural improvements, whereas a Duroc Pig Few-Shot Dataset was used to assess cross-breed adaptation under a 10-shot setting. The experimental results show that the proposed method achieved 98.16% mAP on the White Pig Base Dataset and 85.52% mAP on the Duroc Pig Few-Shot Dataset. On the Duroc Pig Few-Shot Dataset, the final framework outperformed Faster R-CNN, CenterNet, and YOLOv8, and also surpassed DCGAN- and SRGAN-based augmentation strategies. These results indicate that the proposed method provides an effective and practical solution for cross-breed few-shot pig detection, with potential value for intelligent livestock monitoring under annotation-limited conditions.
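Log-Sum-Exp pooling, one of the detector enhancements named above, interpolates smoothly between average and max pooling. A minimal sketch (the smoothing hyperparameter `r` and its value are assumptions, not the paper's setting):

```python
import numpy as np

def lse_pool(x, r=10.0):
    """Log-Sum-Exp pooling over a feature map:
    (1/r) * log(mean(exp(r * x))), computed stably by factoring out max(x).
    r -> 0 recovers average pooling; r -> inf recovers max pooling."""
    m = x.max()  # subtract the max before exponentiating for numerical stability
    return m + np.log(np.mean(np.exp(r * (x - m)))) / r

fmap = np.array([[0.1, 0.2],
                 [0.3, 0.9]])
# The pooled value sits strictly between the mean and the max.
assert fmap.mean() < lse_pool(fmap) < fmap.max()
```

Because the response is dominated by, but not equal to, the strongest activation, LSE pooling keeps gradient signal flowing to non-maximal locations, which is useful when many similar targets (pigs in a dense pen) compete within one window.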
(This article belongs to the Section Bioinformatics)

18 pages, 1365 KB  
Article
DA-CycleGAN: Degradation-Adaptive Unpaired Super-Resolution for Historical Image Restoration
by Lujun Zhai, Yonghui Wang, Yu Zhou and Suxia Cui
J. Imaging 2026, 12(4), 155; https://doi.org/10.3390/jimaging12040155 - 3 Apr 2026
Viewed by 391
Abstract
Historical images, long the dominant medium for documenting the world and its inhabitants, can help us better understand history. Owing to the limited camera technology of the early to mid-20th century, historical images from that period tend to be blurry, noisy, and indistinct. The goal of this paper is to super-resolve images for historical image restoration. Compared to the degradations in modern digital imagery, those in historical images are typically far more complex and less well understood. This discrepancy between historical images and modern high-definition digital images leads to a significant performance drop for existing super-resolution (SR) models trained on modern digital imagery. To tackle this problem, we propose a new method, DA-CycleGAN. Specifically, DA-CycleGAN is built on top of CycleGAN to achieve unsupervised learning. We introduce a degradation-adaptive (DA) module with strong, flexible adaptation to learn various unknown degradations from samples. Moreover, we collect a large dataset containing 10,000 low-resolution images from real historical films, featuring a variety of natural degradations. Our experimental results demonstrate the superior performance of DA-CycleGAN and the effectiveness of our image dataset for achieving accurate super-resolution enhancement of historical images.
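A fixed degradation pipeline of the kind DA-CycleGAN must adapt to can be sketched as below. This is only a stand-in for illustration: the actual DA module *learns* unknown degradations from samples, whereas this toy uses a hand-picked blur/downsample/noise chain with assumed parameters.

```python
import numpy as np

def degrade(img, scale=2, blur=3, noise_sigma=0.05, rng=None):
    """Toy degradation model: box blur -> downsample -> additive noise.
    All three stages and their parameters are illustrative assumptions."""
    rng = rng or np.random.default_rng(0)
    k = np.ones((blur, blur)) / (blur * blur)   # uniform (box) blur kernel
    h, w = img.shape
    out = np.zeros_like(img)
    pad = np.pad(img, blur // 2, mode="edge")   # edge-pad so output size matches
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(pad[i:i + blur, j:j + blur] * k)
    lr = out[::scale, ::scale]                  # naive downsampling
    return np.clip(lr + rng.normal(0, noise_sigma, lr.shape), 0, 1)

hr = np.random.default_rng(1).random((16, 16))
lr = degrade(hr)
assert lr.shape == (8, 8)
```

The performance gap the abstract describes arises precisely because real historical degradations do not follow any such fixed recipe, which is why an adaptive, unpaired formulation is needed.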
(This article belongs to the Section Computer Vision and Pattern Recognition)

26 pages, 9198 KB  
Article
Towards Pseudo-Labeling with Dynamic Thresholds for Cross-View Image Geolocalization
by Yuanyuan Yuan, Jianzhong Guo, Ruoxin Zhu, Ning Li, Ziwei Li and Weiran Luo
Remote Sens. 2026, 18(6), 944; https://doi.org/10.3390/rs18060944 - 20 Mar 2026
Viewed by 349
Abstract
Cross-view image geolocalization aims to localize images that lack geo-tags by matching ground-view images with geo-tagged satellite images. However, the imaging differences between ground and satellite viewpoints are substantial, and existing methods usually rely on a large number of accurately labeled cross-view image pairs. To address the significant perspective differences, high annotation costs, and low utilization of unpaired data, this paper proposes a cross-view generation model that integrates multi-scale contrastive learning and dynamic optimization. It designs a multi-scale contrastive loss function to strengthen semantic consistency between the generated images and the target domain, adaptively balances the quality and quantity of pseudo-labels through a dynamic threshold screening mechanism, and introduces a hard-sample triplet loss to enhance the model's discriminative ability. Ablation experiments on the CVUSA and CVACT datasets show that the proposed BEV-CycleGAN+CL (Bird's-Eye View Cycle-Consistent Generative Adversarial Network with Contrastive Learning) model significantly outperforms the comparative models on the PSNR, SSIM, and RMSE metrics. Specifically, on the CVACT dataset, compared with the BEV-CycleGAN, BEV, and CycleGAN baselines, PSNR increased by 2.83%, 16.02%, and 42.30%, SSIM increased by 6.12%, 8.00%, and 18.48%, and RMSE decreased by 9.28%, 15.51%, and 25.35%, respectively. Similar advantages are observed on the CVUSA dataset. Compared with current state-of-the-art models, the dynamic-threshold pseudo-label localization method demonstrates overall superiority in recall metrics such as R@1, R@5, R@10, and R@1%, for example achieving an R@1 of 98.94% on CVUSA, outperforming the best comparative model, Sample4G, at 98.68%. This study provides methodological support for disaster emergency response, high-precision map construction for autonomous driving, military reconnaissance, and other applications.
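The dynamic-threshold screening idea can be sketched as follows. The linear-decay schedule with a floor is an assumed illustration of "adaptively balancing quality and quantity"; the paper's actual adaptation rule may differ.

```python
import numpy as np

def select_pseudo_labels(scores, base_tau=0.9, floor=0.6, decay=0.05, epoch=0):
    """Dynamic-threshold pseudo-label screening: the confidence threshold
    relaxes as training progresses, trading label purity for quantity.
    The schedule (linear decay, clamped at a floor) is a hypothetical
    stand-in for the paper's dynamic threshold mechanism."""
    tau = max(floor, base_tau - decay * epoch)
    keep = scores >= tau
    return keep, tau

scores = np.array([0.95, 0.88, 0.70, 0.55])
keep0, tau0 = select_pseudo_labels(scores, epoch=0)  # strict: tau = 0.90
keep5, tau5 = select_pseudo_labels(scores, epoch=5)  # relaxed: tau = 0.65
print(keep0.sum(), keep5.sum())  # 1 3
```

Early epochs admit only high-confidence matches (protecting the model from noisy pseudo-labels); later epochs admit more of the unpaired data once the model is stronger.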

30 pages, 1397 KB  
Article
GAN-Based Cross-Modality Brain MRI Synthesis: Paired Versus Unpaired Training and Comparison with Diffusion and Transformer Models
by Behnam Kiani Kalejahi, Sebelan Danishvar and Mohammad Javad Rajabi
Biomimetics 2026, 11(3), 175; https://doi.org/10.3390/biomimetics11030175 - 2 Mar 2026
Viewed by 811
Abstract
Incomplete or faulty MRI sequences are common in clinical practice and can impair AI-based analyses that rely on complete multi-contrast data. The relative effectiveness of classical generative adversarial networks (GANs) versus modern diffusion and transformer-based models for clinically usable MRI synthesis remains unclear. This study evaluates cross-modality MRI synthesis using the BraTS 2019 brain tumour dataset, focusing on T1-to-T2 translation. We assess paired and unpaired CycleGAN models and compare them with two stronger but computationally intensive baselines, a conditional denoising diffusion probabilistic model (DDPM) and a transformer-enhanced GAN, using identical data splits and preprocessing pipelines. Inter-modality correlation was evaluated to estimate the achievable similarity between modalities. Conceptually, modality synthesis may be viewed as a representation-learning approach that compensates for missing imaging information by reconstructing clinically relevant features from available contrasts. Paired CycleGAN achieved correlations of r ≈ 0.92–0.93 and SSIM ≈ 0.90–0.92, approaching the natural T1–T2 correlation (r ≈ 0.95) while maintaining very fast inference (<50 ms/slice). Unpaired CycleGAN achieved r ≈ 0.74–0.78 and SSIM ≈ 0.82–0.85, producing clinically interpretable reconstructions without voxel-level supervision. DDPM achieved the highest fidelity (SSIM ≈ 0.93–0.95, r ≈ 0.94) but required substantially greater computational resources, while transformer-enhanced GAN performance was intermediate. Qualitative analysis showed that CycleGAN and DDPM best preserved tumour and tissue boundaries, whereas unpaired CycleGAN occasionally over-smoothed subtle lesions. These findings highlight the trade-off between fidelity and efficiency in cross-modality MRI synthesis, suggesting paired CycleGAN for time-sensitive clinical workflows and diffusion models as a computationally expensive accuracy upper bound.
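The inter-modality correlation r used above to bound achievable synthesis quality is a plain Pearson correlation over co-registered voxels; the toy "T2" constructed below is synthetic, for illustration only.

```python
import numpy as np

def intermodality_r(a, b):
    """Pearson correlation between two co-registered image arrays,
    flattened to voxel vectors."""
    a, b = a.ravel().astype(float), b.ravel().astype(float)
    a = a - a.mean()
    b = b - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(0)
t1 = rng.random((32, 32))
t2 = 0.8 * t1 + 0.2 * rng.random((32, 32))  # synthetic, partially correlated "T2"
r = intermodality_r(t1, t2)
assert 0.9 < r < 1.0
```

In the study's framing, a synthesized T2 whose r against real T2 approaches the natural T1–T2 correlation (≈0.95) is extracting nearly all the cross-contrast information that is actually present.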

36 pages, 124129 KB  
Article
Spatial–Spectral Fusion 3D Signal Compensation for Moon Mineralogy Mapper (M3) Hyperspectral Images in Low-Signal Lunar Polar Regions
by Rui Ni, Tingyu Meng, Fei Zhao, Yanan Dang, Wenbin Zhang and Pingping Lu
Remote Sens. 2026, 18(5), 682; https://doi.org/10.3390/rs18050682 - 25 Feb 2026
Viewed by 471
Abstract
Hyperspectral images (HSIs) from the lunar polar regions are frequently compromised by low signal-to-noise ratio (SNR) under adverse illumination, limiting their utility for scientific analysis. Existing spectral-only compensation approaches operate without spatial context, leading to speckle-like artifacts that degrade spatial consistency and constrain subsequent applications. To address this limitation, we propose SSF-3DSC, a spatial–spectral fusion 3D signal-compensation framework tailored for lunar HSIs to simultaneously restore spectral fidelity and spatial consistency under extreme low-illumination conditions. To the best of our knowledge, this represents the first deep learning framework specifically engineered for joint spatial–spectral restoration in the photon-starved regime. SSF-3DSC integrates three specialized components: a spectral compensation module (SCM) for restoring spectral fidelity, a multi-scale spatial attention (MSA) module for capturing hierarchical spatial patterns, and a cascaded 3D residual convolutional module (C3D-RCM) for refining spatial–spectral representations. Trained on paired low- and high-SNR Moon Mineralogy Mapper (M3) data cubes from the lunar south polar region, SSF-3DSC employs synergistic spatial–spectral fusion to achieve high-fidelity reconstruction, significantly outperforming a spectral-only lunar baseline (Paired-CycleGAN). Regional-scale experiments demonstrate its ability to recover both spatially coherent geological structures and spectrally reliable mineral abundance maps. By establishing a new benchmark for lunar HSI restoration under low-illumination conditions, this work enhances the scientific utility of low-signal M3 data and enables robust quantitative investigations into the Moon's challenging polar regions.

28 pages, 5365 KB  
Article
Early Remaining Useful Life Prediction of Lithium-Ion Batteries Based on a Hybrid Machine Learning Method with Time Series Augmentation
by Jingwei Zhang, Jian Huang, Taihua Zhang, Erbao He, Sipeng Wang and Liguo Yao
Sensors 2026, 26(4), 1238; https://doi.org/10.3390/s26041238 - 13 Feb 2026
Viewed by 556
Abstract
Early and accurate prediction of the remaining useful life (RUL), defined as the number of operational cycles a battery can continue to function before reaching its end-of-life threshold, is crucial for improving the reliability of new energy vehicles. To address noise contamination, capacity regeneration effects, and data scarcity in early-stage prognostics, this paper proposes a hybrid framework integrating signal decomposition, time series augmentation, and deep forecasting. The raw capacity sequence is decomposed using Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN) to separate multi-scale components. A Transformer-enhanced time series generative adversarial network (HyT-GAN) is then employed to augment decomposed components, improving robustness under small-sample conditions. A CNN-BiGRU predictor is trained for capacity forecasting, and key hyperparameters are tuned via the Dung Beetle Optimizer (DBO). Experiments on NASA and CALCE benchmark datasets demonstrate that the proposed method achieves accurate early-stage prediction using only 20% historical data, with R² ranging from 0.9643 to 0.9972 and RMSE/MAE below 0.0296/0.0198. These results indicate that the proposed framework can deliver reliable RUL estimates under data-limited and noisy measurement conditions.
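The RUL definition the abstract opens with (cycles remaining until capacity first crosses the end-of-life threshold) can be made concrete in a few lines; the 80% threshold and the toy capacity sequence are illustrative assumptions.

```python
import numpy as np

def remaining_useful_life(capacity, eol_fraction=0.8):
    """RUL at each cycle: cycles left before capacity first drops below the
    end-of-life threshold, taken here as a fraction of initial capacity
    (70-80% is a common Li-ion convention; 0.8 is an assumption)."""
    eol = eol_fraction * capacity[0]
    below = np.nonzero(capacity < eol)[0]
    if below.size == 0:
        return None  # cell never reaches end of life within the record
    eol_cycle = below[0]
    # Clamp at zero for cycles at or past end of life.
    return np.maximum(eol_cycle - np.arange(len(capacity)), 0)

cap = np.array([2.0, 1.9, 1.8, 1.7, 1.55, 1.5])  # fades below 1.6 Ah at cycle 4
print(remaining_useful_life(cap))  # [4 3 2 1 0 0]
```

Early-stage prediction, as evaluated in the paper, means estimating this quantity from only the first portion (here, 20%) of the capacity history.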

22 pages, 2506 KB  
Article
CycleGAN-Based Data Augmentation for Scanning Electron Microscope Images to Enhance Integrated Circuit Manufacturing Defect Classification
by Andrew Yen, Nemo Chang, Jean Chien, Lily Chuang and Eric Lee
Electronics 2026, 15(4), 803; https://doi.org/10.3390/electronics15040803 - 13 Feb 2026
Viewed by 434
Abstract
Semiconductor defect inspection is frequently hindered by data scarcity and the resulting class imbalance in supervised learning. This study proposes a CycleGAN-based data augmentation pipeline designed to synthesize realistic defective CD-SEM images from abundant normal patterns, incorporating a quantitative quality control mechanism. Using an ADI CD-SEM dataset, we conducted a sensitivity analysis by cropping original 1024 × 1024 micrographs into 512 × 512 and 256 × 256 inputs. Our results indicate that increasing the effective defect-area ratio is critical for improving generative stability and defect visibility. To ensure data integrity, we applied a screening protocol based on the Structural Similarity Index (SSIM) and a median absolute deviation noise metric to exclude low-fidelity outputs. When integrated into the training of XceptionNet classifiers, this filtered augmentation strategy yielded substantial performance gains on a held-out test set, specifically improving the Recall and F1 score while maintaining a near-ceiling AUC. These results demonstrate that controlled CycleGAN augmentation, coupled with objective quality filtering, effectively mitigates class imbalance constraints and significantly enhances the robustness of automated defect detection.
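A median-absolute-deviation noise score of the kind used in the screening protocol can be sketched as below. The Laplacian high-pass step is an assumption (the paper does not specify how its noise metric is computed); the idea is that MAD of a high-pass response is a robust noise estimate, insensitive to outliers.

```python
import numpy as np

def mad_noise(img):
    """Median absolute deviation of an image's discrete Laplacian response:
    a robust (outlier-resistant) noise score. Generated images scoring
    above a chosen threshold would be discarded as low-fidelity."""
    lap = (4 * img[1:-1, 1:-1] - img[:-2, 1:-1] - img[2:, 1:-1]
           - img[1:-1, :-2] - img[1:-1, 2:])
    return float(np.median(np.abs(lap - np.median(lap))))

rng = np.random.default_rng(0)
clean = np.tile(np.linspace(0, 1, 64), (64, 1))       # smooth gradient, no noise
noisy = clean + rng.normal(0, 0.1, clean.shape)
assert mad_noise(noisy) > mad_noise(clean)
```

Combined with an SSIM check against the source pattern, this gives the two-sided filter the abstract describes: reject outputs that are too noisy, and reject outputs that drift too far from the intended geometry.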

24 pages, 5772 KB  
Article
Method for Generating Pseudo-NDVI from RVI Derived from Satellite-Borne SAR Imagery Data Using CycleGAN and pix2pix Models
by Kohei Arai, Ria Maruta and Hiroshi Okumura
Information 2026, 17(2), 154; https://doi.org/10.3390/info17020154 - 3 Feb 2026
Viewed by 1170
Abstract
Continuous vegetation monitoring is essential for predicting crop varieties and yields; however, optical satellite data are frequently unavailable due to cloud cover. To overcome this limitation, this study proposes a method for generating pseudo-NDVI (Normalized Difference Vegetation Index) imagery from RVI (Radar Vegetation Index) derived from Synthetic Aperture Radar (SAR) data using Generative Adversarial Networks (GANs). Two architectures, pix2pixHD (supervised) and CycleGAN (unsupervised), were evaluated using Sentinel-1 and Sentinel-2 data under identical conditions. By introducing RVI as an intermediate feature instead of directly converting SAR backscatter to NDVI, the proposed method enhanced physical interpretability and improved correlation with NDVI. Quantitative results show that pix2pix achieved higher accuracy (SSIM = 0.5667, PSNR = 22.24 dB, RMSE = 20.54) than CycleGAN (SSIM = 0.5240, PSNR = 19.54 dB, RMSE = 28.02), with further improvement when combining VV and VH polarization data. Although the absolute accuracy remains moderate, this approach enables continuous annual NDVI time series reconstruction for crop monitoring under persistent cloud conditions, demonstrating clear advantages over conventional direct SAR-to-NDVI conversion methods.
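The two indices involved are simple band ratios. NDVI is standard; for the dual-pol RVI, the sketch below uses the common Sentinel-1 form 4·VH / (VV + VH) on linear-power backscatter, which may not be the exact variant the paper uses.

```python
import numpy as np

def ndvi(nir, red):
    """NDVI from optical bands (e.g., Sentinel-2 B8/B4): (NIR - Red) / (NIR + Red).
    A small epsilon guards against division by zero."""
    return (nir - red) / (nir + red + 1e-12)

def rvi_dual_pol(vv, vh):
    """Dual-pol Radar Vegetation Index from Sentinel-1 linear-power
    backscatter: 4 * VH / (VV + VH). One standard definition; the paper's
    exact RVI variant may differ."""
    return 4.0 * vh / (vv + vh + 1e-12)

nir, red = np.array([0.5]), np.array([0.1])
print(ndvi(nir, red))            # ~0.667
vv, vh = np.array([0.08]), np.array([0.02])
print(rvi_dual_pol(vv, vh))      # 0.8
```

Translating RVI to pseudo-NDVI, rather than raw backscatter to NDVI, keeps both ends of the GAN mapping on physically interpretable vegetation indices, which is the interpretability argument made above.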

15 pages, 2072 KB  
Article
A Ceramic Rare Defect Amplification Method Based on TC-CycleGAN
by Zhiqiang Zeng, Changying Dang, Zebing Ma, Jiansu Li and Zhonghua Li
Sensors 2026, 26(2), 395; https://doi.org/10.3390/s26020395 - 7 Jan 2026
Cited by 1 | Viewed by 412
Abstract
Deep learning-based ceramic defect detection suffers from scarce rare-defect samples and class imbalance. Current deep generative image augmentation techniques are of limited use for augmenting rare ceramic defects because of uneven image brightness and the weak features of small defects, resulting in poor image quality and limited improvement in detection results. This paper proposes a ceramic rare-defect image augmentation method based on TC-CycleGAN. TC-CycleGAN builds on the CycleGAN framework and optimizes the generator and discriminator structures to better suit ceramic defect features, thereby improving the quality of generated images. The generator, TC-UNet, introduces the scSE and DehazeFormer modules on top of UNet, effectively enhancing the model's ability to learn subtle defect features on the ceramic surface; the discriminator, TC-PatchGAN, replaces the original BatchNorm module with the ContraNorm module, increasing the discriminator's sensitivity to tiny ceramic defect features and enhancing the diversity of generated images. Image quality assessment experiments show that the proposed method significantly improves the quality of generated defective images: for the concave defect type, FID and KID decreased by 49% and 73%, respectively, and for the smoke-stain type, by 57% and 63%, respectively. Further defect detection experiments show that when the detection model is trained on the dataset expanded by this method, its recognition accuracy for rare defects improves significantly, with detection accuracy for the concave and smoke-stain defect types increasing by 1.2% and 3.9%, respectively.
(This article belongs to the Section Sensing and Imaging)

27 pages, 7808 KB  
Article
An Enhanced CycleGAN to Derive Temporally Continuous NDVI from Sentinel-1 SAR Images
by Anqi Wang, Zhiqiang Xiao, Chunyu Zhao, Juan Li, Yunteng Zhang, Jinling Song and Hua Yang
Remote Sens. 2026, 18(1), 56; https://doi.org/10.3390/rs18010056 - 24 Dec 2025
Viewed by 627
Abstract
Frequent cloud cover severely limits the use of optical remote sensing for continuous ecological monitoring. Synthetic aperture radar (SAR) offers an all-weather alternative, but translating SAR data to optical equivalents is challenging, particularly in cloudy regions where paired training data are scarce. To address this, we developed an enhanced CycleGAN (denoted by SA-CycleGAN) to derive a high-fidelity, temporally continuous normalized difference vegetation index (NDVI) from SAR imagery. The SA-CycleGAN introduces a novel spatiotemporal attention generator that dynamically computes global and local feature relationships to capture long-range spatial dependencies across diverse landscapes. Furthermore, a structural similarity (SSIM) loss function is integrated into the SA-CycleGAN to preserve the structural and textural integrity of the synthesized images. The performance of the SA-CycleGAN and three unsupervised models (DualGAN, GP-UNIT, and DCLGAN) was evaluated by deriving NDVI time series from Sentinel-1 SAR images across four sites with different vegetation types. Ablation experiments were conducted to verify the contributions of the key components in the SA-CycleGAN model. The results demonstrate that the SA-CycleGAN significantly outperformed the comparison models across all four sites. Quantitatively, the proposed method achieved the lowest Root Mean Square Error (RMSE) of 0.0502 and the highest Coefficient of Determination (R²) of 0.88 at the Zhangbei and Xishuangbanna sites, respectively. The ablation experiments confirmed that the attention mechanism and SSIM loss function were crucial for capturing long-range features and maintaining spatial structure. The SA-CycleGAN proves to be a robust and effective solution for overcoming data gaps in optical time series.
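The SSIM term added to the SA-CycleGAN objective can be sketched in its global (single-window) form. Real implementations compute SSIM over local Gaussian windows, so this whole-image version is a simplification for illustration.

```python
import numpy as np

def ssim_global(x, y, L=1.0):
    """Global SSIM between two images in [0, L]: the product of luminance
    and contrast/structure terms with the standard stabilizing constants
    C1 = (0.01 L)^2, C2 = (0.03 L)^2."""
    c1, c2 = (0.01 * L) ** 2, (0.03 * L) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

def ssim_loss(x, y):
    """Structural loss term: minimized (at 0) when the images match."""
    return 1.0 - ssim_global(x, y)

img = np.random.default_rng(0).random((32, 32))
assert ssim_loss(img, img) < 1e-9
```

Adding `1 - SSIM` to the cycle and adversarial losses penalizes structural and textural drift that a per-pixel L1 term alone tolerates, which is the stated motivation for the term.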

28 pages, 4422 KB  
Article
Enhanced Object Detection Algorithms in Complex Environments via Improved CycleGAN Data Augmentation and AS-YOLO Framework
by Zhen Li, Yuxuan Wang, Lingzhong Meng, Wenjuan Chu and Guang Yang
J. Imaging 2025, 11(12), 447; https://doi.org/10.3390/jimaging11120447 - 12 Dec 2025
Cited by 1 | Viewed by 1070
Abstract
Object detection in complex environments, such as challenging lighting conditions, adverse weather, and target occlusions, poses significant difficulties for existing algorithms. To address these challenges, this study introduces a collaborative solution integrating improved CycleGAN-based data augmentation and an enhanced object detection framework, AS-YOLO. The improved CycleGAN incorporates a dual self-attention mechanism and spectral normalization to enhance feature capture and training stability. The AS-YOLO framework integrates a channel–spatial parallel attention mechanism, an AFPN structure for improved feature fusion, and the Inner_IoU loss function for better generalization. The experimental results show that, compared with YOLOv8n, the AS-YOLO algorithm improves mAP@0.5 and mAP@0.95 by 1.5% and 0.6%, respectively. After data augmentation and style transfer, mAP@0.5 and mAP@0.95 increase by 14.6% and 17.8%, respectively, demonstrating the effectiveness of the proposed method in complex scenarios.
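The base overlap computation underlying the Inner_IoU loss is ordinary IoU; Inner_IoU additionally scores scaled-down "inner" auxiliary boxes to adjust convergence behavior, which is not reproduced in this minimal sketch.

```python
def iou(box_a, box_b):
    """Intersection over Union between two axis-aligned boxes given as
    (x1, y1, x2, y2) with x1 < x2 and y1 < y2."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)  # zero if boxes are disjoint
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # 1/7 ~ 0.1429
```

A bounding-box regression loss is then typically `1 - IoU` (or a variant such as Inner_IoU), so that perfect overlap gives zero loss.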
(This article belongs to the Special Issue Advances in Machine Learning for Computer Vision Applications)

18 pages, 27194 KB  
Article
A Synthetic Image Generation Pipeline for Vision-Based AI in Industrial Applications
by Nishanth Nandakumar and Jörg Eberhardt
Appl. Sci. 2025, 15(23), 12600; https://doi.org/10.3390/app152312600 - 28 Nov 2025
Cited by 1 | Viewed by 1708
Abstract
The collection and annotation of large-scale image datasets remains a significant challenge in training vision-based AI models, especially in domains such as industrial automation. In industrial settings, this limitation is especially critical for quality inspection tasks within Flexible Manufacturing Systems and Batch-Size-of-One production, where high variability in components restricts the availability of relevant datasets. This study presents a pipeline for generating photorealistic synthetic images to support automated visual inspection. Rendered images derived from geometric models of manufactured parts are enhanced using a Cycle-Consistent Adversarial Network (CycleGAN), which transfers pixel-level features from real camera images. The pipeline is applied in two scenarios: (1) domain transfer between similar objects for data augmentation, and (2) domain transfer between dissimilar objects to synthesize images before physical production. The generated images are evaluated using mean Average Precision (mAP) and the Turing test, respectively. The pipeline is further validated in two industrial setups: object detection for a pick-and-place task using a Niryo robot, and anomaly detection in products manufactured by a FESTO machine. The successful implementation of the pipeline demonstrates its potential to generate effective training data for vision-based AI in industrial applications and highlights the importance of enhancing domain quality in industrial synthetic data workflows.
(This article belongs to the Special Issue Artificial Intelligence for Industrial Informatics)

19 pages, 4815 KB  
Article
A Novel Anti-UAV Detection Method for Airport Safety Based on Style Transfer Learning and Deep Learning
by Ruiheng Zhang, Yitao Song, Ruoxi Zhang, Yang Lei, Hanglin Cheng and Jingtao Zhong
Electronics 2025, 14(23), 4620; https://doi.org/10.3390/electronics14234620 - 25 Nov 2025
Cited by 1 | Viewed by 676
Abstract
Unmanned aerial vehicle (UAV) intrusions cause flight delays and disrupt airport operations, so accurate monitoring is essential for safety. To address the scarcity and mismatch of real-world training data in small-target detection, an anti-UAV approach is proposed that integrates style transfer learning (STL) with deep learning. An airport monitoring platform is established to acquire a real UAV dataset, and a Cycle-Consistent Generative Adversarial Network (CycleGAN) is employed to synthesize multi-scene images that simulate diverse airport backgrounds, thereby enriching the training distribution. Using these simulated scenes, a controlled comparison of YOLOv5/YOLOv6/YOLOv7/YOLOv8 is conducted, in which YOLOv5 achieves the best predictive performance with AP values of 93.95%, 98.09%, and 97.07% across three scenarios. On public UAV datasets, the STL-enhanced model (YOLOv5_STL) is further compared with other small-object detectors and consistently exhibits superior performance, indicating strong cross-scene generalization. Overall, the proposed method provides an economical, real-time solution for airport UAV intrusion prevention while maintaining high accuracy and robustness.

31 pages, 17949 KB  
Article
Domain-Unified Adaptive Detection Framework for Small Vehicle Targets in Monostatic/Bistatic SAR Images
by Zheng Ye and Peng Zhou
Remote Sens. 2025, 17(22), 3671; https://doi.org/10.3390/rs17223671 - 7 Nov 2025
Viewed by 931
Abstract
Benefiting from the advantages of unmanned aerial vehicle (UAV) platforms such as low cost, rapid deployment capability, and miniaturization, the application of UAV-borne synthetic aperture radar (SAR) has developed rapidly. Utilizing a self-developed monostatic Miniaturized SAR (MiniSAR) system and a bistatic MiniSAR system, our team conducted multiple imaging missions over the same vehicle equipment display area at different times. However, system disparities and time-varying factors lead to a mismatch between the distributions of the training and test data. Additionally, small ground vehicle targets under complex background clutter exhibit limited size and weak scattering characteristics. These two issues pose significant challenges to the precise detection of small ground vehicle targets. To address these issues, this article proposes a domain-unified adaptive target detection framework (DUA-TDF). The approach consists of two stages: image-to-image translation, followed by feature extraction and target detection. In the first stage, a multi-scale detail-aware CycleGAN (MSDA-CycleGAN) is proposed to align the source and target domains at the image level by achieving unpaired image style transfer while emphasizing both the global structure and local details of the generated images. In the second stage, a cross-window axial self-attention target detection network (CWASA-Net) is proposed. This network employs a hybrid backbone centered on the cross-window axial self-attention mechanism to enhance feature representation, coupled with a convolution-based stacked cross-scale feature fusion network to strengthen multi-scale feature interaction. To validate the effectiveness and generalization capability of the proposed algorithm, comprehensive experiments are conducted on both the self-developed monostatic/bistatic SAR datasets and a public dataset. The experimental results show that our method achieves an mAP50 exceeding 90% in within-domain tests and maintains over 80% in cross-domain scenarios, demonstrating robust detection performance and cross-domain adaptability.
