Search Results (107)

Search Parameters:
Keywords = multi-discriminator generative adversarial network

13 pages, 1883 KiB  
Article
A GAN-Based Method for Cognitive Covert Communication UAV Jamming-Assistance Under Fully Labeled Sample Conditions
by Wenxuan Fu, Bo Li, Haipeng Wang, Haochen Gong and Xiang Lin
Technologies 2025, 13(7), 283; https://doi.org/10.3390/technologies13070283 - 3 Jul 2025
Viewed by 237
Abstract
This paper addresses the optimization problem for mobile jamming-assistance schemes in cognitive covert communication (CR-CC), where cognitive users adopt the underlay mode for spectrum access while an unmanned aerial vehicle (UAV) transmits same-frequency noise signals to interfere with eavesdroppers. Leveraging the inherent dynamic game-theoretic characteristics of covert communication (CC) systems, we propose a novel covert communication optimization algorithm based on generative adversarial networks (GAN-CC) to achieve system-wide optimization under the constraint of maximum detection error probability. In GAN-CC, the generator simulates legitimate users to generate UAV interference-assistance schemes, while the discriminator simulates the optimal signal detection of eavesdroppers. The alternating iterative optimization of these two components simulates the dynamic game process in CC and ultimately reaches the Nash equilibrium. Numerical results show that, compared with commonly used multi-objective optimization and nonlinear programming algorithms, the proposed algorithm converges faster and more stably, enabling the derivation of optimal mobile interference-assistance schemes for cognitive CC systems.
(This article belongs to the Section Information and Communication Technologies)
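
For readers who want the mechanics, a minimal sketch of the alternating generator/detector optimization the abstract describes: the generator proposes jamming that keeps covert transmissions undetectable, while the detector plays the eavesdropper. The toy signal model, layer sizes, and `sig_dim` are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

sig_dim = 16  # hypothetical received-signal dimension

gen = nn.Sequential(nn.Linear(sig_dim, 64), nn.ReLU(), nn.Linear(64, sig_dim))  # jamming scheme
det = nn.Sequential(nn.Linear(sig_dim, 64), nn.ReLU(), nn.Linear(64, 1))        # eavesdropper's detector

g_opt = torch.optim.Adam(gen.parameters(), lr=1e-4)
d_opt = torch.optim.Adam(det.parameters(), lr=1e-4)
bce = nn.BCEWithLogitsLoss()
ones, zeros = torch.ones(32, 1), torch.zeros(32, 1)

for step in range(200):
    noise_only = torch.randn(32, sig_dim)        # channel noise, no transmission
    covert = torch.randn(32, sig_dim) + 1.0      # toy covert transmission
    received = covert + gen(noise_only)          # add UAV jamming assistance
    # Detector step: learn to tell transmissions from pure noise.
    d_opt.zero_grad()
    d_loss = bce(det(received.detach()), ones) + bce(det(noise_only), zeros)
    d_loss.backward()
    d_opt.step()
    # Generator step: shape the jamming so transmissions look like noise (covertness).
    g_opt.zero_grad()
    g_loss = bce(det(received), zeros)
    g_loss.backward()
    g_opt.step()
```

Alternating these two steps is the dynamic game the abstract refers to; at equilibrium the detector can do no better than chance.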

22 pages, 27201 KiB  
Article
Spatiotemporal Interactive Learning for Cloud Removal Based on Multi-Temporal SAR–Optical Images
by Chenrui Xu, Zhenfei Wang, Liang Chen and Xiangchao Meng
Remote Sens. 2025, 17(13), 2169; https://doi.org/10.3390/rs17132169 - 24 Jun 2025
Viewed by 361
Abstract
Optical remote sensing images suffer from information loss due to cloud interference, while Synthetic Aperture Radar (SAR), with its all-weather, day-and-night imaging capability, provides crucial auxiliary data for cloud removal and reconstruction. However, existing cloud removal methods face two key challenges: insufficient utilization of the spatiotemporal information in multi-temporal data, and fusion difficulties arising from the fundamentally different imaging mechanisms of optical and SAR sensors. To address these challenges, a spatiotemporal feature interaction-based cloud removal method is proposed to effectively fuse SAR and optical images. Built upon a conditional generative adversarial network framework, the method incorporates three key modules: a multi-temporal spatiotemporal feature joint extraction module, a spatiotemporal information interaction module, and a spatiotemporal discriminator module. These components jointly establish a many-to-many spatiotemporal interactive learning network, which separately extracts and fuses spatiotemporal features from multi-temporal SAR–optical image pairs to generate temporally consistent, cloud-free image sequences. Experiments on both simulated and real datasets demonstrate the superior performance of the proposed method.
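
A minimal sketch of what a spatiotemporal discriminator over SAR-conditioned optical sequences can look like, assuming 3D convolutions over (channels, time, height, width); the paper's actual module may differ.

```python
import torch
import torch.nn as nn

class SpatioTemporalDiscriminator(nn.Module):
    def __init__(self, opt_ch=3, sar_ch=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(opt_ch + sar_ch, 32, kernel_size=3, stride=(1, 2, 2), padding=1),
            nn.LeakyReLU(0.2),
            nn.Conv3d(32, 64, kernel_size=3, stride=(1, 2, 2), padding=1),
            nn.LeakyReLU(0.2),
            nn.Conv3d(64, 1, kernel_size=3, padding=1),  # patch-level real/fake map
        )

    def forward(self, optical_seq, sar_seq):
        # inputs: (batch, channels, time, height, width)
        return self.net(torch.cat([optical_seq, sar_seq], dim=1))

d = SpatioTemporalDiscriminator()
score = d(torch.randn(2, 3, 4, 64, 64), torch.randn(2, 1, 4, 64, 64))
print(score.shape)  # a score map over time and space, not a single scalar
```

A patch map rather than one scalar lets such a discriminator penalize local temporal inconsistencies across the generated sequence.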

32 pages, 6964 KiB  
Article
MDFT-GAN: A Multi-Domain Feature Transformer GAN for Bearing Fault Diagnosis Under Limited and Imbalanced Data Conditions
by Chenxi Guo, Vyacheslav V. Potekhin, Peng Li, Elena A. Kovalchuk and Jing Lian
Appl. Sci. 2025, 15(11), 6225; https://doi.org/10.3390/app15116225 - 31 May 2025
Viewed by 582
Abstract
In industrial scenarios, bearing fault diagnosis often suffers from data scarcity and class imbalance, which significantly hinder the generalization performance of data-driven models. While generative adversarial networks (GANs) have shown promise in data augmentation, their efficacy deteriorates in the presence of multi-category and structurally complex fault distributions. To address these challenges, this paper proposes a novel fault diagnosis framework based on a Multi-Domain Feature Transformer GAN (MDFT-GAN). Specifically, raw vibration signals are transformed into 2D RGB representations via joint time-domain, frequency-domain, and time–frequency-domain mappings, effectively encoding multi-perspective fault signatures. A Transformer-based feature extractor, integrated with Efficient Channel Attention (ECA), is embedded into both the generator and discriminator to capture global dependencies and channel-wise interactions, thereby enhancing the representation quality of synthetic samples. Furthermore, a gradient penalty (GP) term is introduced to stabilize adversarial training and suppress mode collapse. To improve classification performance, an Enhanced Hybrid Visual Transformer (EH-ViT) is constructed by coupling a lightweight convolutional stem with a ViT encoder, enabling robust and discriminative fault identification. Beyond performance metrics, this work also incorporates a Grad-CAM-based interpretability scheme to visualize hierarchical feature activation patterns within the discriminator, providing transparent insight into the model's decision-making rationale across different fault types. Extensive experiments on the CWRU and Jiangnan University (JNU) bearing datasets validate that the proposed method achieves superior diagnostic accuracy, robustness under limited and imbalanced conditions, and enhanced interpretability compared to existing state-of-the-art approaches.
(This article belongs to the Special Issue Explainable Artificial Intelligence Technology and Its Applications)
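
The gradient penalty (GP) term mentioned here is, in its usual form, the WGAN-GP regularizer of Gulrajani et al. (2017); a minimal sketch for image batches, assuming the paper follows the standard formulation.

```python
import torch

def gradient_penalty(disc, real, fake, lambda_gp=10.0):
    """Penalize deviation of the critic's gradient norm from 1 on
    random interpolates between real and generated samples."""
    eps = torch.rand(real.size(0), 1, 1, 1, device=real.device)
    interp = (eps * real + (1 - eps) * fake).requires_grad_(True)
    score = disc(interp)
    grads = torch.autograd.grad(outputs=score, inputs=interp,
                                grad_outputs=torch.ones_like(score),
                                create_graph=True)[0]
    grad_norm = grads.flatten(start_dim=1).norm(2, dim=1)
    return lambda_gp * ((grad_norm - 1) ** 2).mean()
```

Added to the discriminator loss, this keeps the critic approximately 1-Lipschitz, which is what stabilizes training and suppresses mode collapse.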

19 pages, 12185 KiB  
Article
Dual-Domain Adaptive Synergy GAN for Enhancing Low-Light Underwater Images
by Dechuan Kong, Jinglong Mao, Yandi Zhang, Xiaohu Zhao, Yanyan Wang and Shungang Wang
J. Mar. Sci. Eng. 2025, 13(6), 1092; https://doi.org/10.3390/jmse13061092 - 30 May 2025
Viewed by 621
Abstract
The increasing application of underwater robotic systems in deep-sea exploration, inspection, and resource extraction has created a strong demand for reliable visual perception under challenging conditions. However, image quality is severely degraded in low-light underwater environments due to the combined effects of light absorption and scattering, resulting in color imbalance, low contrast, and illumination instability. These factors limit the effectiveness of vision-based autonomous operations. To address these issues, we propose ATS-UGAN, a Dual-domain Adaptive Synergy Generative Adversarial Network for low-light underwater image enhancement. The network integrates Multi-scale Hybrid Attention (MHA) that synergizes spatial and frequency domain representations to capture key image features adaptively. An Adaptive Parameterized Convolution (AP-Conv) module is introduced to handle non-uniform scattering by dynamically adjusting convolution kernels through a multi-branch design. In addition, a Dynamic Content-aware Markovian Discriminator (DCMD) is employed to perceive the dual-domain information synergistically, enhancing image texture realism and improving color correction. Extensive experiments on benchmark underwater datasets demonstrate that ATS-UGAN surpasses state-of-the-art approaches, achieving 28.7/0.92 PSNR/SSIM on EUVP and 28.2/0.91 on UFO-120. Additional reference and no-reference metrics further confirm the improved visual quality and realism of the enhanced images.
(This article belongs to the Section Ocean Engineering)
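
A minimal sketch, under our own assumptions, of the dual-domain idea behind MHA: derive an attention gate from both the spatial feature map and its frequency-domain magnitude, then reweight the input. The module name and layer sizes here are hypothetical, not the paper's.

```python
import torch
import torch.nn as nn

class DualDomainAttention(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.spatial = nn.Conv2d(ch, ch, 3, padding=1)
        self.freq = nn.Conv2d(ch, ch, 1)
        self.gate = nn.Sequential(nn.Conv2d(2 * ch, ch, 1), nn.Sigmoid())

    def forward(self, x):
        s = self.spatial(x)
        f = torch.fft.fft2(x, norm="ortho").abs()   # magnitude spectrum as a frequency descriptor
        f = self.freq(f)
        attn = self.gate(torch.cat([s, f], dim=1))  # fuse both domains into one gate
        return x * attn

m = DualDomainAttention(16)
print(m(torch.randn(1, 16, 32, 32)).shape)
```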

30 pages, 4437 KiB  
Article
Smart Maritime Transportation-Oriented Ship-Speed Prediction Modeling Using Generative Adversarial Networks and Long Short-Term Memory
by Xinqiang Chen, Peishi Wu, Yajie Zhang, Xiaomeng Wang, Jiangfeng Xian and Han Zhang
J. Mar. Sci. Eng. 2025, 13(6), 1045; https://doi.org/10.3390/jmse13061045 - 26 May 2025
Viewed by 626
Abstract
Ship-speed prediction is an emerging research area in marine traffic safety and related fields, where it occupies an important position. At present, time-series forecasting methods perform poorly in ship-speed prediction, accumulate errors in long-term forecasting, and are limited in combining ship-speed information with multi-feature data input. To overcome these difficulties and further improve ship-speed prediction accuracy, this research proposes a new deep learning framework that combines GANs (Generative Adversarial Networks) and LSTM (Long Short-Term Memory). First, the algorithm takes an LSTM network as the generating network and uses the LSTM to mine the spatiotemporal correlation between nodes. Second, the complementary characteristics linking the generative network and the discriminant network are used to eliminate the cumulative error of a single neural network in long-term prediction and improve the network's accuracy in ship-speed determination. Finally, the proposed Generator–LSTM model is used for ship-speed prediction and compared with other models on identical AIS (automatic identification system) ship-speed data from the same scene. The findings indicate that the model achieves high accuracy on typical error metrics, meaning it can reliably predict ship speed. The results will help maritime traffic participants take better precautions against collisions and improve maritime traffic safety.
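
A minimal sketch (assumed shapes, not the authors' code) of an LSTM generator that maps a multi-feature AIS window to the next speed value; in the GAN setup, a discriminator would then score such predicted sequences against real ones.

```python
import torch
import torch.nn as nn

class LSTMGenerator(nn.Module):
    def __init__(self, n_features=4, hidden=64, horizon=1):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, horizon)   # predicted speed value(s)

    def forward(self, x):                        # x: (batch, time, features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])             # predict from the last time step

gen = LSTMGenerator()
window = torch.randn(8, 20, 4)  # 20 past AIS points; the 4 features (e.g. speed,
                                # course, lat, lon) are an assumption
print(gen(window).shape)        # (8, 1): next-step speed per trajectory
```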

22 pages, 4109 KiB  
Article
An Unsupervised Anomaly Detection Method for Railway Fasteners Based on Knowledge-Distilled Generative Adversarial Networks
by Hongyan Chen, Zhiwei Li and Xinjie Xiao
Appl. Sci. 2025, 15(11), 5933; https://doi.org/10.3390/app15115933 - 24 May 2025
Viewed by 480
Abstract
The integrity and stability of railway fasteners are of vital importance to railway safety. To address the challenges of limited anomaly samples, irregular defect geometries, and complex operational conditions in rail fastener anomaly detection, this paper proposes an unsupervised anomaly detection method using a knowledge-distilled generative adversarial network. First, the proposed method employs collaborative teacher–student learning to model normal sample distributions, where the student network reconstructs input images as normal outputs while a discriminator identifies anomalies by comparing input and reconstructed images. Second, a multi-scale attention-coupling feature-enhancement mechanism is proposed, effectively integrating hierarchical semantic information with spatial-channel attention to achieve both precise target localization and robust background suppression in the teacher network. Third, an enhanced anomaly discriminator is designed to incorporate an enhanced pyramid upsampling module, through which fine-grained details are preserved via multi-level feature map aggregation, resulting in significantly improved sensitivity for small-sized anomaly detection. Finally, the proposed method achieved an AUC of 94.0%, an ACC of 92.5%, and an F1 score of 91.6% on the MNIST dataset, and an AUC of 94.7%, an ACC of 90.1%, and an F1 score of 87.8% on the railway fastener dataset, demonstrating the method's superior anomaly detection ability.
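
A minimal sketch of the detection rule the abstract implies: the student reconstructs the input as if it were normal, and the anomaly score grows with the input–reconstruction discrepancy. The optional two-argument `disc(x, recon)` term is a hypothetical stand-in for the paper's discriminator.

```python
import torch

def anomaly_score(student, x, disc=None, w=0.5):
    """Per-image score: a high input-reconstruction error suggests an anomaly."""
    with torch.no_grad():
        recon = student(x)                                   # reconstruct "as normal"
        score = (x - recon).abs().flatten(start_dim=1).mean(dim=1)
        if disc is not None:                                 # hypothetical pair-wise discriminator
            d = disc(x, recon).flatten(start_dim=1).mean(dim=1)
            score = (1 - w) * score + w * d
    return score

student = torch.nn.Conv2d(3, 3, 3, padding=1)  # stand-in for the distilled student network
print(anomaly_score(student, torch.randn(4, 3, 64, 64)))
```

Because only normal samples are seen in training, anything the student cannot reconstruct well is flagged, which is what makes the method unsupervised.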

16 pages, 3751 KiB  
Article
Improved Face Image Super-Resolution Model Based on Generative Adversarial Network
by Qingyu Liu, Yeguo Sun, Lei Chen and Lei Liu
J. Imaging 2025, 11(5), 163; https://doi.org/10.3390/jimaging11050163 - 19 May 2025
Viewed by 635
Abstract
Image super-resolution (SR) models based on the generative adversarial network (GAN) face challenges such as unnatural facial detail restoration and local blurring. This paper proposes an improved GAN-based model to address these issues. First, a Multi-scale Hybrid Attention Residual Block (MHARB) is designed, which dynamically enhances feature representation in critical face regions through dual-branch convolution and channel-spatial attention. Second, an Edge-guided Enhancement Block (EEB) is introduced, generating adaptive detail residuals by combining edge masks and channel attention to accurately recover high-frequency textures. Furthermore, a multi-scale discriminator with a weighted sub-discriminator loss is developed to balance global structural and local detail generation quality. Additionally, a phase-wise training strategy with dynamic adjustment of the learning rate (Lr) and loss function weights is implemented to improve the realism of super-resolved face images. Experiments on the CelebA-HQ dataset demonstrate that the proposed model achieves a PSNR of 23.35 dB, an SSIM of 0.7424, and an LPIPS of 24.86, outperforming classical models and delivering superior visual quality in high-frequency regions. Notably, this model also surpasses the SwinIR model (PSNR: 23.28 dB → 23.35 dB, SSIM: 0.7340 → 0.7424, and LPIPS: 30.48 → 24.86), validating the effectiveness of the improved model and the training strategy in preserving facial details.
(This article belongs to the Section AI in Imaging)
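
A minimal sketch of a multi-scale discriminator with weighted sub-discriminator losses, in the pix2pixHD style; the number of scales and the weights are assumptions, not the paper's values.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def make_patch_disc(ch=3):
    return nn.Sequential(
        nn.Conv2d(ch, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
        nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
        nn.Conv2d(128, 1, 4, padding=1),
    )

class MultiScaleDiscriminator(nn.Module):
    def __init__(self, n_scales=3, weights=(1.0, 0.5, 0.25)):
        super().__init__()
        self.discs = nn.ModuleList(make_patch_disc() for _ in range(n_scales))
        self.weights = weights

    def forward(self, img):
        total, x = 0.0, img
        for d, w in zip(self.discs, self.weights):
            total = total + w * d(x).mean()   # weighted sub-discriminator score
            x = F.avg_pool2d(x, 2)            # next sub-discriminator sees half resolution
        return total

d = MultiScaleDiscriminator()
print(d(torch.randn(2, 3, 128, 128)))
```

The full-resolution sub-discriminator judges local detail while the downsampled ones judge global structure, which is the balance the abstract describes.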

26 pages, 11273 KiB  
Article
DREFNet: Deep Residual Enhanced Feature GAN for VVC Compressed Video Quality Improvement
by Tanni Das and Kiho Choi
Mathematics 2025, 13(10), 1609; https://doi.org/10.3390/math13101609 - 14 May 2025
Viewed by 382
Abstract
The rapid growth of video content in recent years has led to an increased reliance on video codecs for efficient compression and transmission. However, codecs such as H.265/High Efficiency Video Coding (HEVC) and H.266/Versatile Video Coding (VVC) face challenges that can impact video quality and performance. One significant challenge is the trade-off between compression efficiency and visual quality: while advanced codecs can significantly reduce file sizes, they introduce artifacts such as blocking, blurring, and color distortion, particularly in high-motion scenes. The compression tools in modern codecs are vital for minimizing artifacts that arise during encoding and decoding, but even their advanced algorithms frequently struggle to eliminate artifacts entirely. Post-processing applied after initial decoding can significantly improve visual clarity and restore details that were compromised during compression. In this paper, we introduce a Deep Residual Enhanced Feature Generative Adversarial Network as a post-processing method aimed at further improving the quality of reconstructed frames from the advanced VVC codec. By utilizing Deep Residual Blocks and Enhanced Feature Blocks, the generator network aims to make the reconstructed frame as similar as possible to the original frame. The discriminator network guides the generator by evaluating the authenticity of generated frames: by distinguishing between fake and original frames, it provides a feedback mechanism that pushes the generator to create more realistic frames, ultimately enhancing the overall performance of the model. The proposed method shows significant gains for Random Access (RA) and All Intra (AI) configurations in terms of Video Multimethod Assessment Fusion (VMAF) and the Multi-Scale Structural Similarity Index Measure (MS-SSIM). For VMAF, the proposed method obtains 13.05% and 11.09% Bjøntegaard Delta Rate (BD-Rate) gains for the RA and AI configurations, respectively. For the luma-component MS-SSIM, the RA and AI configurations obtain 5.00% and 5.87% BD-Rate gains, respectively.
(This article belongs to the Special Issue Intelligent Computing with Applications in Computer Vision)
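
A minimal sketch of the residual-block pattern a post-processing generator of this kind typically stacks; channel counts and depth are assumptions, not DREFNet's actual blocks.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, ch=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1),
        )

    def forward(self, x):
        return x + self.body(x)   # residual: learn only the correction

# post-processing: decoded VVC frame in, artifact correction out
restore = nn.Sequential(
    nn.Conv2d(3, 64, 3, padding=1),
    *[ResidualBlock(64) for _ in range(4)],
    nn.Conv2d(64, 3, 3, padding=1),
)
frame = torch.randn(1, 3, 128, 128)    # stand-in for a decoded frame
print((frame + restore(frame)).shape)  # global residual connection
```

Predicting only a correction to the already-decoded frame is what makes residual designs attractive for artifact removal: the network never has to re-synthesize content the codec preserved.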

25 pages, 5634 KiB  
Article
Dual-Domain Multi-Task Learning-Based Domain Adaptation for Hyperspectral Image Classification
by Qiusheng Chen, Zhuoqun Fang, Shizhuo Deng, Tong Jia, Zhaokui Li and Dongyue Chen
Remote Sens. 2025, 17(9), 1592; https://doi.org/10.3390/rs17091592 - 30 Apr 2025
Viewed by 449
Abstract
Enhancing target-domain discriminability is a key focus in Unsupervised Domain Adaptation (UDA) for HyperSpectral Image (HSI) classification. However, existing methods overlook bringing similar cross-domain samples closer together in the feature space to achieve the indirect transfer of source-domain classification knowledge. To overcome this issue, we propose a Multi-Task Learning-based Domain Adaptation (MTLDA) method. MTLDA incorporates an inductive transfer mechanism into adversarial training, transferring source classification knowledge to target representation learning during domain alignment. To enhance target feature discriminability, we propose dual-domain contrastive learning to construct related tasks: a shared mapping network simultaneously performs Source-domain supervised Contrastive Learning (SCL) and Target-domain unsupervised Contrastive Learning (TCL), ensuring that similar samples across domains are positioned closely in the feature space and thereby improving cross-scene HSI classification accuracy. Furthermore, we design a feature-level data augmentation method based on feature masking to assist the contrastive learning tasks and generate more varied training data. Experimental results on three prominent HSI datasets demonstrate the superior efficacy of MTLDA in cross-scene HSI classification.
(This article belongs to the Section Remote Sensing Image Processing)
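
The SCL and TCL objectives belong to the contrastive-loss family; below is a minimal, generic NT-Xent sketch (not the paper's exact losses), in which embeddings of two views of the same sample attract while all others repel.

```python
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, tau=0.1):
    """z1, z2: (batch, dim) embeddings of two views of the same samples."""
    z = F.normalize(torch.cat([z1, z2]), dim=1)   # (2B, dim), unit-norm rows
    sim = z @ z.t() / tau                         # temperature-scaled cosine similarity
    sim.fill_diagonal_(float("-inf"))             # exclude each row's self-pair
    B = z1.size(0)
    # the positive of row i is row i+B, and vice versa
    targets = torch.cat([torch.arange(B) + B, torch.arange(B)])
    return F.cross_entropy(sim, targets)

loss = nt_xent(torch.randn(16, 32), torch.randn(16, 32))
print(loss.item())
```

In MTLDA's supervised variant the positive set would be all same-class samples rather than a single augmented view; the masking-based feature augmentation supplies the extra views.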

16 pages, 41626 KiB  
Article
DAGANFuse: Infrared and Visible Image Fusion Based on Differential Features Attention Generative Adversarial Networks
by Yuxin Wen and Wen Liu
Appl. Sci. 2025, 15(8), 4560; https://doi.org/10.3390/app15084560 - 21 Apr 2025
Viewed by 607
Abstract
The purpose of multi-modal visual information fusion is to integrate data from multiple sensors to generate an image with higher quality, more information, and greater clarity, so that it contains more complementary information and fewer redundant features. Infrared sensors detect thermal radiation emitted by objects, which is related to their temperature, whereas visible-light sensors generate images by capturing the light that interacts with objects through reflection, diffusion, and transmission. Because of these different imaging principles, the generated infrared and visible images differ greatly, which makes it difficult to extract complementary information. Existing methods generally fuse features at the fusion layer by simple splicing or addition, without considering the intrinsic features of the different modal images or the interaction of features between scales; moreover, they consider only correlation, whereas the image fusion task needs to pay more attention to complementarity. For this reason, we introduce a cross-scale differential features attention generative adversarial fusion network, DAGANFuse. In the generator, we design a cross-modal differential features attention module to fuse the intrinsic content of different modal images, computing attention weights along parallel paths for differential features and fusion features, with spatial and channel attention weights calculated in parallel on the two paths. In the discriminator, a dual discriminator is used to maintain the information balance between modalities and avoid common problems such as information blurring and loss of texture details. Experimental results show that DAGANFuse achieves state-of-the-art (SOTA) fusion performance, outperforming existing methods.
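
A minimal sketch of the dual-discriminator idea: one discriminator judges the fused image against infrared characteristics, the other against visible ones, so neither modality dominates. Networks and loss weights are assumptions, not the paper's.

```python
import torch
import torch.nn as nn

def make_disc(ch=1):
    return nn.Sequential(
        nn.Conv2d(ch, 32, 3, stride=2, padding=1), nn.LeakyReLU(0.2),
        nn.Conv2d(32, 1, 3, padding=1),            # patch-level scores
    )

d_ir, d_vis = make_disc(), make_disc()             # one discriminator per modality
bce = nn.BCEWithLogitsLoss()

def generator_adv_loss(fused, w_ir=0.5, w_vis=0.5):
    # The generator wants both discriminators to accept the fused image.
    s_ir, s_vis = d_ir(fused), d_vis(fused)
    return (w_ir * bce(s_ir, torch.ones_like(s_ir)) +
            w_vis * bce(s_vis, torch.ones_like(s_vis)))

print(generator_adv_loss(torch.randn(2, 1, 64, 64)).item())
```

During training, `d_ir` would be fed real infrared images as positives and `d_vis` real visible images, which is what keeps the two information sources balanced in the fused output.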

21 pages, 5892 KiB  
Article
Generating Large-Scale Origin–Destination Matrix via Progressive Growing Generative Adversarial Networks Model
by Zehao Yuan, Xuanyan Chen, Biyu Chen, Yubo Luo, Yu Zhang, Wenxin Teng and Chao Zhang
ISPRS Int. J. Geo-Inf. 2025, 14(4), 172; https://doi.org/10.3390/ijgi14040172 - 14 Apr 2025
Viewed by 703
Abstract
The origin–destination (OD) matrix describes traffic-flow information between regions and is a critical input for intelligent transportation systems (ITS). However, obtaining the OD matrix remains challenging due to high costs and privacy concerns. Synthetic data, which share the statistical distribution of real data, help address privacy issues and data scarcity. OD matrix generation models based on Generative Adversarial Networks (GAN) can effectively generate synthetic OD matrices and help address the challenge of obtaining OD matrix data in ITS research. However, existing OD matrix generation methods can only handle tens of nodes. To address this limitation, this study proposes Origin–Destination Progressive Growing Generative Adversarial Networks (OD-PGGAN) for the large-scale OD matrix generation task, adapting the PGGAN architecture. OD-PGGAN adopts a progressive learning strategy to gradually learn the structure of the OD matrix from coarse to fine scales, utilizes multi-scale generators and discriminators to perform generation and discrimination at different spatial resolutions, and introduces a geography-based upsampling and downsampling algorithm to maintain the geographical significance of the OD matrix during spatial-resolution transformations. The results demonstrate that OD-PGGAN can generate a large-scale synthetic OD matrix with 1024 nodes that has the same distribution as the real samples, outperforming two classical methods, and can effectively provide reliable synthetic data for transportation applications.
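
A minimal sketch of the progressive-growing mechanic that OD-PGGAN adapts: while a newly added, higher-resolution stage trains, its output is blended with the upsampled previous stage by a fade-in factor alpha. Plain nearest-neighbor upsampling stands in here for the paper's geography-based algorithm.

```python
import torch
import torch.nn.functional as F

def fade_in(prev_stage_out, new_stage_out, alpha):
    """alpha grows from 0 to 1 while the newly added resolution trains."""
    up = F.interpolate(prev_stage_out, scale_factor=2, mode="nearest")
    return alpha * new_stage_out + (1 - alpha) * up

coarse = torch.randn(1, 1, 32, 32)   # OD matrix at the previous resolution
fine = torch.randn(1, 1, 64, 64)     # output of the newly grown stage
print(fade_in(coarse, fine, alpha=0.3).shape)
```

Fading the new stage in gradually is what lets the network reach 1024 nodes without the training instability of starting at full resolution.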

22 pages, 5152 KiB  
Article
Hyper-CycleGAN: A New Adversarial Neural Network Architecture for Cross-Domain Hyperspectral Data Generation
by Yibo He, Kah Phooi Seng, Li Minn Ang, Bei Peng and Xingyu Zhao
Appl. Sci. 2025, 15(8), 4188; https://doi.org/10.3390/app15084188 - 10 Apr 2025
Cited by 1 | Viewed by 956
Abstract
The scarcity of labeled training samples poses a significant challenge in hyperspectral image classification. Cross-scene classification has been shown to be an effective approach to the problem of limited-sample learning. This paper investigates the use of generative adversarial networks (GANs) to enable collaborative artificial intelligence learning on hyperspectral datasets. We propose a specialized architecture, termed Hyper-CycleGAN, for heterogeneous transfer learning across source and target scenes. The architecture establishes bidirectional mappings through efficient adversarial training, combining source-to-target and target-to-source generators, and augments the strengths of GANs with custom modifications such as multi-scale attention mechanisms to enhance feature learning specifically for hyperspectral data. To address training instability, a discriminator trained with the Wasserstein GAN with gradient penalty (WGAN-GP) loss is utilized. Additionally, a label smoothing technique is introduced to enhance the generalization capability of the generator, particularly in handling unlabeled samples, thus improving model robustness. Experiments validate the effectiveness of the cross-domain Hyper-CycleGAN approach on two real-world cross-scene hyperspectral image datasets. By addressing the challenge of limited labeled samples in hyperspectral image classification, this research contributes valuable insights for remote sensing, environmental monitoring, and medical imaging applications.
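
A minimal sketch of the bidirectional-mapping constraint in CycleGAN-style training: source → target → source should reproduce the input, and vice versa. The 1×1-conv generators over a hypothetical 64-band spectrum are placeholders for the paper's networks.

```python
import torch
import torch.nn as nn

l1 = nn.L1Loss()

def cycle_loss(G_st, G_ts, x_src, x_tgt, lam=10.0):
    return lam * (l1(G_ts(G_st(x_src)), x_src) +   # source -> target -> source
                  l1(G_st(G_ts(x_tgt)), x_tgt))    # target -> source -> target

# toy generators over hypothetical per-pixel spectra (64 bands assumed)
G_st = nn.Conv2d(64, 64, 1)
G_ts = nn.Conv2d(64, 64, 1)
x_src, x_tgt = torch.randn(2, 64, 8, 8), torch.randn(2, 64, 8, 8)
print(cycle_loss(G_st, G_ts, x_src, x_tgt).item())
```

The cycle term is what makes unpaired cross-scene training possible: neither scene needs pixel-wise correspondence with the other.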

17 pages, 1384 KiB  
Article
BCAMP: A Behavior-Controllable Motion Control Method Based on Adversarial Motion Priors for Quadruped Robot
by Yuzeng Peng, Zhaoyang Cai, Lei Zhang and Xiaohui Wang
Appl. Sci. 2025, 15(6), 3356; https://doi.org/10.3390/app15063356 - 19 Mar 2025
Viewed by 537
Abstract
In unpredictable scenarios, quadruped robots with behavior-controllable capabilities can often improve their adaptability through interaction with users. In this paper, we propose a behavior-controllable motion control method that integrates user commands with adversarial motion priors, enabling a quadruped robot to achieve behavior-controllable capabilities. First, a motion trajectory library is constructed to provide motion prior knowledge: optimal control methods with whole-body dynamic constraints are used to generate stable dynamic trajectories for various motions, and these trajectory data are then standardized and assigned different weights. Second, an adversarial motion prior network structure combined with user commands is proposed, with reward functions tailored to different motion behaviors to achieve behavior control. This structure acts as a single-motion prior discriminator which, compared to a multi-motion prior discriminator, avoids complex architectures; the incorporation of user commands also addresses the issue whereby a single-motion prior discriminator struggles to clearly select actions as the dataset expands. Finally, simulations and comparative experiments are conducted to evaluate the effectiveness of the proposed method.
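
A minimal sketch of how adversarial motion priors commonly convert a discriminator score into a style reward (following the widely used AMP formulation; the paper's exact reward shaping and the blending weight below are assumptions).

```python
import torch

def amp_style_reward(disc_score):
    """disc_score: discriminator output on a state transition; values near 1
    mean the motion resembles the reference trajectory library."""
    return torch.clamp(1.0 - 0.25 * (disc_score - 1.0) ** 2, min=0.0)

def total_reward(task_r, disc_score, w_style=0.5):
    # blend command-tracking (task) reward with motion-style reward
    return (1 - w_style) * task_r + w_style * amp_style_reward(disc_score)

print(total_reward(torch.tensor(0.8), torch.tensor(0.9)))
```

The task term is where the user commands enter: different commanded behaviors swap in different tracking rewards while the single style discriminator stays fixed.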

22 pages, 5441 KiB  
Article
High-Dimensional Attention Generative Adversarial Network Framework for Underwater Image Enhancement
by Shasha Tian, Adisorn Sirikham, Jessada Konpang and Chuyang Wang
Electronics 2025, 14(6), 1203; https://doi.org/10.3390/electronics14061203 - 19 Mar 2025
Viewed by 450
Abstract
In recent years, underwater image enhancement (UIE) technology has developed rapidly, and underwater optical imaging has shown great advantages in the intelligent operation of underwater robots. In underwater environments, light absorption and scattering often leave seabed images blurry and color-distorted, so acquiring high-definition underwater imagery of superior quality is essential for advancing the exploration and development of marine resources. To resolve the problems of chromatic aberration, insufficient exposure, and blurring in underwater images, a high-dimensional attention generative adversarial network framework for underwater image enhancement (HDAGAN) is proposed. The method is composed of a generator and a discriminator. The generator comprises an encoder and a decoder. In the encoder, a channel attention residual module (CARM) is designed to capture both semantic features and contextual details from visual data, incorporating multi-scale feature extraction and multi-scale feature fusion layers. In the decoder, to refine the feature representation of latent vectors for detail recovery, a strengthen–operate–subtract module (SOSM) is introduced to strengthen the model's ability to comprehend the image's geometric structure and semantic information. In the discriminator, a multi-scale feature discrimination module (MFDM) is proposed to achieve more precise discrimination. Experimental findings demonstrate that the approach significantly outperforms state-of-the-art UIE techniques, delivering enhanced results with higher visual appeal.
(This article belongs to the Special Issue Artificial Intelligence in Graphics and Images)
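
A minimal sketch of a channel-attention residual block in the spirit of CARM; the real module also contains multi-scale extraction and fusion layers, and all sizes here are assumptions.

```python
import torch
import torch.nn as nn

class ChannelAttentionResidual(nn.Module):
    def __init__(self, ch=64, reduction=8):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1),
        )
        self.attn = nn.Sequential(           # squeeze-and-excitation style gate
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(ch, ch // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv2d(ch // reduction, ch, 1), nn.Sigmoid(),
        )

    def forward(self, x):
        y = self.conv(x)
        return x + y * self.attn(y)          # reweight channels, then add residual

m = ChannelAttentionResidual()
print(m(torch.randn(1, 64, 32, 32)).shape)
```

Channel gating of this kind is a natural fit for underwater color correction, since degradation is strongly wavelength-dependent and therefore channel-dependent.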

20 pages, 3968 KiB  
Article
Research on Multi-Scale Point Cloud Completion Method Based on Local Neighborhood Dynamic Fusion
by Yalun Liu, Jiantao Sun and Ling Zhao
Appl. Sci. 2025, 15(6), 3006; https://doi.org/10.3390/app15063006 - 10 Mar 2025
Viewed by 1046
Abstract
Point cloud completion reconstructs incomplete, sparse inputs into complete 3D shapes. However, in current 3D completion tasks it is difficult to effectively extract the local details of an incomplete point cloud, resulting in poor restoration of local details and low accuracy of the completed point clouds. To address this problem, this paper proposes a multi-scale point cloud completion method based on local neighborhood dynamic fusion (LNDF: adaptive aggregation of multi-scale local features through dynamic range and weight adjustment). First, the farthest point sampling (FPS) strategy is applied to the original incomplete point clouds to obtain down-sampled point clouds at three different scales. When extracting features from point clouds of different scales, the local neighborhood aggregation of key points is dynamically adjusted, and a Transformer architecture is integrated to further enhance the correlation of local feature extraction information. Second, by generating point clouds layer by layer in a pyramid-like manner, the local details of the point clouds are gradually enriched from coarse to fine to achieve completion. Finally, inspired by generative adversarial networks (GANs), the decoder adds an attention discriminator, designed as a feature extraction layer in series with an attention layer, to further optimize the network's completion performance. Experimental results show that LNDM-Net reduces the average Chamfer Distance (CD) by 5.78% on PCN and 4.54% on ShapeNet compared to SOTA methods. Visualizations of the completion results demonstrate the method's superior completion accuracy and local detail preservation, and on diverse samples and incomplete point clouds from real-world 3D scenes in the KITTI dataset the approach exhibits enhanced generalization capability and completion fidelity.
(This article belongs to the Special Issue Advanced Pattern Recognition & Computer Vision)
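
A minimal sketch of the farthest point sampling (FPS) step used for the multi-scale down-sampling described above; each pick greedily maximizes distance to the points already chosen, giving uniform coverage of the shape.

```python
import torch

def farthest_point_sampling(points, k):
    """points: (N, 3). Greedily pick k points, each maximizing its distance
    to the set already chosen."""
    n = points.size(0)
    chosen = torch.zeros(k, dtype=torch.long)
    dist = torch.full((n,), float("inf"))
    idx = 0                                   # start from an arbitrary point
    for i in range(k):
        chosen[i] = idx
        d = ((points - points[idx]) ** 2).sum(dim=1)
        dist = torch.minimum(dist, d)         # distance to nearest chosen point
        idx = int(torch.argmax(dist))         # next pick: farthest remaining point
    return points[chosen]

cloud = torch.randn(2048, 3)
print(farthest_point_sampling(cloud, 512).shape)  # e.g. 512/256/128 for three scales
```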
