Search Results (3,293)

Search Parameters:
Keywords = global–local features

20 pages, 4847 KiB  
Article
FCA-STNet: Spatiotemporal Growth Prediction and Phenotype Extraction from Image Sequences for Cotton Seedlings
by Yiping Wan, Bo Han, Pengyu Chu, Qiang Guo and Jingjing Zhang
Plants 2025, 14(15), 2394; https://doi.org/10.3390/plants14152394 - 2 Aug 2025
Abstract
To address the limitations of existing cotton seedling growth prediction methods in field environments, specifically poor representation of spatiotemporal features and low visual fidelity in texture rendering, this paper proposes an algorithm for predicting cotton seedling growth from images based on FCA-STNet. The model leverages historical sequences of cotton seedling RGB images to generate an image of the predicted growth at time t + 1 and extracts 37 phenotypic traits from the predicted image. A novel STNet structure is designed to enhance the representation of spatiotemporal dependencies, while an Adaptive Fine-Grained Channel Attention (FCA) module is integrated to capture both global and local feature information. This attention mechanism focuses on individual cotton plants and their textural characteristics, effectively reducing interference from common field-related challenges such as insufficient lighting, leaf fluttering, and wind disturbances. The experimental results demonstrate that the predicted images achieved an MSE of 0.0086, an MAE of 0.0321, an SSIM of 0.8339, and a PSNR of 20.7011 on the test set, representing improvements of 2.27%, 0.31%, 4.73%, and 11.20%, respectively, over the baseline STNet. The method outperforms several mainstream spatiotemporal prediction models. Furthermore, most of the predicted phenotypic traits correlated with actual measurements at coefficients above 0.8, indicating high prediction accuracy. The proposed FCA-STNet model enables visually realistic prediction of cotton seedling growth in open-field conditions, offering a new perspective for research on growth prediction.
(This article belongs to the Special Issue Advances in Artificial Intelligence for Plant Research)
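The abstract does not spell out the FCA module's internals, so the following is only a minimal sketch of a channel-attention block in its spirit: global average pooling supplies a scene-level channel descriptor, and an ECA-style 1D convolution models local cross-channel interactions. The class name, kernel size, and overall design are assumptions, not the authors' module.

```python
import torch
import torch.nn as nn

class FineGrainedChannelAttention(nn.Module):
    """Illustrative channel attention mixing global and local descriptors.

    Hypothetical stand-in for the paper's FCA module: global average
    pooling captures scene-level statistics, and a 1D convolution over
    the channel descriptor captures local cross-channel interactions
    (as in ECA-style attention).
    """
    def __init__(self, channels: int, kernel_size: int = 3):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.conv = nn.Conv1d(1, 1, kernel_size, padding=kernel_size // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        y = self.pool(x).view(b, 1, c)        # (B, 1, C) channel descriptor
        y = self.conv(y)                       # local cross-channel interaction
        w = self.sigmoid(y).view(b, c, 1, 1)   # per-channel gate in [0, 1]
        return x * w                           # reweight feature maps

x = torch.randn(2, 64, 32, 32)
print(FineGrainedChannelAttention(64)(x).shape)  # torch.Size([2, 64, 32, 32])
```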
23 pages, 3120 KiB  
Article
Bee Swarm Metropolis–Hastings Sampling for Bayesian Inference in the Ginzburg–Landau Equation
by Shucan Xia and Lipu Zhang
Algorithms 2025, 18(8), 476; https://doi.org/10.3390/a18080476 - 2 Aug 2025
Abstract
To improve the sampling efficiency of Markov Chain Monte Carlo in complex parameter spaces, this paper proposes BeeSwarm-MH, an adaptive sampling method that integrates a swarm intelligence mechanism into Metropolis–Hastings. The method combines global exploration by scout bees with local exploitation by worker bees, employing multi-stage perturbation intensities and adaptive step-size tuning to enable efficient posterior sampling. Focusing on Bayesian inference for parameter estimation in the soliton solutions of the two-dimensional complex Ginzburg–Landau equation, we design a dedicated inference framework to systematically compare the performance of BeeSwarm-MH with the classical Metropolis–Hastings algorithm. Experimental results demonstrate that BeeSwarm-MH achieves comparable estimation accuracy while significantly reducing the number of iterations and total computation time required for convergence. Moreover, it exhibits superior global search capability and adaptivity, offering a practical approach to efficient Bayesian inference in complex physical models.
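As a rough illustration of the scout/worker split, here is a minimal NumPy sketch of a multi-chain Metropolis–Hastings sampler with coarse and fine perturbation scales and heuristic step adaptation. The target density, bee counts, and adaptation rule are all assumptions; the paper's actual algorithm targets Ginzburg–Landau soliton parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_post(theta):
    # Stand-in log-posterior: the paper targets Ginzburg-Landau soliton
    # parameters; a 2D Gaussian keeps this sketch self-contained.
    return -0.5 * np.sum((theta - np.array([1.0, -2.0])) ** 2)

def bee_swarm_mh(n_iter=4000, n_scouts=2, n_workers=6):
    n_bees = n_scouts + n_workers
    bees = rng.normal(size=(n_bees, 2))                        # one MH chain per bee
    steps = np.where(np.arange(n_bees) < n_scouts, 1.0, 0.1)  # coarse vs. fine moves
    logp = np.array([log_post(b) for b in bees])
    samples = []
    for _ in range(n_iter):
        for i in range(n_bees):
            prop = bees[i] + steps[i] * rng.normal(size=2)
            lp = log_post(prop)
            if np.log(rng.uniform()) < lp - logp[i]:   # MH accept/reject
                bees[i], logp[i] = prop, lp
                steps[i] *= 1.01   # heuristic adaptation; a strict MCMC run
            else:                  # would freeze step sizes after burn-in
                steps[i] *= 0.99
            samples.append(bees[i].copy())
    return np.asarray(samples)

s = bee_swarm_mh()
print(s[len(s) // 2:].mean(axis=0))   # should be close to [1, -2]
```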
37 pages, 9057 KiB  
Review
Palaeoclimatic Geoheritage in the Age of Climate Change: Educational Use of the Pleistocene Glacial and Periglacial Geodiversity
by Paweł Wolniewicz and Maria Górska-Zabielska
Geosciences 2025, 15(8), 294; https://doi.org/10.3390/geosciences15080294 - 2 Aug 2025
Abstract
The lithological record of past climates and climate changes holds significant potential for enhancing education about, and understanding of, global climate change and its impacts on contemporary societies. The relatively young geological record of Pleistocene cooling and glaciations is one of the most useful geo-educational tools. The present study comprises a comprehensive review of ongoing efforts to assess and communicate Pleistocene glacial geoheritage, with a detailed case study of Poland. A literature review evaluates the extent of scientific work on inventorying and communicating the geodiversity of Pleistocene glacial and periglacial environments globally. The study demonstrates a steady increase in the number of scientific contributions focused on the evaluation and promotion of Pleistocene geoheritage, with a notable transition from the description of geosites to the establishment of geoconservation practices and educational strategies. The relative complexity of the palaeoclimatic record and the presence of glacial geodiversity features across extensive areas indicate that effective scientific communication of climate change requires careful selection of a limited number of geodiversity elements and sediment types. In this context, the use of glacial erratic boulders and rock gardens to promote Pleistocene glacial geoheritage is advocated, and the significance of educational initiatives for local communities and the preservation of geocultural heritage is outlined in detail.
(This article belongs to the Special Issue Challenges and Research Trends of Geoheritage and Geoconservation)

17 pages, 2076 KiB  
Article
Detection and Classification of Power Quality Disturbances Based on Improved Adaptive S-Transform and Random Forest
by Dongdong Yang, Shixuan Lü, Junming Wei, Lijun Zheng and Yunguang Gao
Energies 2025, 18(15), 4088; https://doi.org/10.3390/en18154088 - 1 Aug 2025
Abstract
The increasing penetration of renewable energy into power systems has intensified transient power quality (PQ) disturbances, demanding efficient detection and classification methods to enable timely operational decisions. This paper introduces a hybrid framework combining an Improved Adaptive S-Transform (IAST) with a Random Forest (RF) classifier to address these challenges. The IAST employs a globally adaptive Gaussian window as its kernel function, which automatically adjusts window length and spectral resolution based on real-time frequency characteristics, thereby enhancing time–frequency localization accuracy while reducing algorithmic complexity. To optimize computational efficiency, window parameters are determined through an energy concentration maximization criterion, enabling rapid extraction of discriminative features from diverse PQ disturbances (e.g., voltage sags and transient interruptions). These features are then fed into an RF classifier, which simultaneously mitigates model variance and bias, achieving robust classification. Experimental results show that the proposed IAST–RF method achieves a classification accuracy of 99.73%, demonstrating its potential for real-time PQ monitoring in modern grids with high renewable energy penetration.
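For orientation, below is a minimal sketch pairing a classical discrete S-transform (fixed Gaussian window; the paper's IAST instead adapts the window via an energy-concentration criterion) with a scikit-learn Random Forest on toy sag/swell waveforms. The signal parameters, features, and class definitions are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def s_transform(x):
    """Discrete Stockwell transform, frequency-domain formulation, with
    the classical Gaussian window. The paper's IAST adapts the window
    width per frequency; this sketch keeps the fixed form for brevity."""
    n = len(x)
    X = np.fft.fft(x)
    m = np.fft.fftfreq(n) * n                  # frequency bin indices
    S = np.zeros((n // 2, n), dtype=complex)
    for k in range(1, n // 2):                 # skip the DC row
        gauss = np.exp(-2 * np.pi**2 * m**2 / k**2)
        S[k] = np.fft.ifft(np.roll(X, -k) * gauss)
    return S

rng = np.random.default_rng(1)
t = np.linspace(0, 1, 256, endpoint=False)

def make(label):
    # Toy disturbance: voltage sag (label 0) vs. swell (label 1)
    depth = 0.5 if label == 0 else 1.5
    v = np.sin(2 * np.pi * 50 * t)
    v[100:180] *= depth
    st = s_transform(v + 0.02 * rng.normal(size=t.size))
    return np.abs(st).max(axis=1)              # per-frequency peak magnitude

X = np.array([make(i % 2) for i in range(200)])
y = np.array([i % 2 for i in range(200)])
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X[:150], y[:150])
print(clf.score(X[150:], y[150:]))             # near-perfect on this toy task
```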

24 pages, 29785 KiB  
Article
Multi-Scale Feature Extraction with 3D Complex-Valued Network for PolSAR Image Classification
by Nana Jiang, Wenbo Zhao, Jiao Guo, Qiang Zhao and Jubo Zhu
Remote Sens. 2025, 17(15), 2663; https://doi.org/10.3390/rs17152663 - 1 Aug 2025
Abstract
Compared to traditional real-valued neural networks, which process only amplitude information, complex-valued neural networks handle both amplitude and phase information, leading to superior performance in polarimetric synthetic aperture radar (PolSAR) image classification tasks. This paper proposes a multi-scale feature extraction (MSFE) method based on a 3D complex-valued network to improve classification accuracy by fully leveraging multi-scale features, including phase information. We first designed a complex-valued three-dimensional network framework combining complex-valued 3D convolution (CV-3DConv) with complex-valued squeeze-and-excitation (CV-SE) modules. This framework simultaneously captures spatial and polarimetric features, including both amplitude and phase information, from PolSAR images. Furthermore, to address the robustness degradation caused by limited labeled samples, we introduced a multi-scale learning strategy that jointly models global and local features: global features extract overall semantic information, while local features help the network capture region-specific semantics. This strategy improves information utilization by integrating multi-scale receptive fields whose feature advantages complement one another. Extensive experiments on four benchmark datasets demonstrated that the proposed method outperforms various comparison methods and maintains high classification accuracy across different sampling rates, validating its effectiveness and robustness.
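The CV-3DConv idea can be grounded with the standard construction of a complex convolution from two real-valued kernels; the sketch below is that construction in PyTorch, with illustrative tensor shapes (the channel counts and PolSAR slice layout are assumptions).

```python
import torch
import torch.nn as nn

class ComplexConv3d(nn.Module):
    """Complex-valued 3D convolution built from two real Conv3d kernels:
    (Wr + iWi) * (xr + ixi) = (Wr*xr - Wi*xi) + i(Wr*xi + Wi*xr).
    A minimal stand-in for the paper's CV-3DConv."""
    def __init__(self, in_ch, out_ch, k=3, **kw):
        super().__init__()
        self.wr = nn.Conv3d(in_ch, out_ch, k, padding=k // 2, **kw)
        self.wi = nn.Conv3d(in_ch, out_ch, k, padding=k // 2, **kw)

    def forward(self, xr, xi):
        return self.wr(xr) - self.wi(xi), self.wr(xi) + self.wi(xr)

# PolSAR-like input: batch of 2 complex volumes, 1 channel, 9 polarimetric
# slices of 16x16 pixels (real and imaginary parts carried separately).
xr, xi = torch.randn(2, 1, 9, 16, 16), torch.randn(2, 1, 9, 16, 16)
yr, yi = ComplexConv3d(1, 8)(xr, xi)
print(yr.shape, yi.shape)   # torch.Size([2, 8, 9, 16, 16]) twice
```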

24 pages, 23817 KiB  
Article
Dual-Path Adversarial Denoising Network Based on UNet
by Jinchi Yu, Yu Zhou, Mingchen Sun and Dadong Wang
Sensors 2025, 25(15), 4751; https://doi.org/10.3390/s25154751 - 1 Aug 2025
Abstract
Digital image quality is crucial for reliable analysis in applications such as medical imaging, satellite remote sensing, and video surveillance. However, traditional denoising methods struggle to balance noise removal with detail preservation and lack adaptability to various types of noise. We propose a novel three-module architecture for image denoising comprising a generator, a dual-path-UNet-based denoiser, and a discriminator. The generator creates synthetic noise patterns to augment training data, while the dual-path-UNet denoiser uses multiple receptive-field modules to preserve fine details and dense feature fusion to maintain global structural integrity. The discriminator provides adversarial feedback to enhance denoising performance. This dual-path adversarial training mechanism addresses the limitations of traditional methods by simultaneously capturing local details and global structures. Experiments on the SIDD, DND, and PolyU datasets, including comprehensive qualitative and quantitative comparisons with the latest state-of-the-art GAN variants, demonstrate superior performance and confirm effective noise removal with minimal loss of critical image detail. By adapting to various types of noise while maintaining structural integrity, the proposed architecture provides a robust, versatile solution for image processing applications that require high fidelity and detail preservation.
(This article belongs to the Section Sensing and Imaging)
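As a hedged sketch of the "multiple receptive field modules" idea, the block below runs parallel convolutions with growing dilation and fuses them with a 1x1 convolution plus a residual connection; the kernel sizes, dilation rates, and residual design are assumptions, not the authors' exact module.

```python
import torch
import torch.nn as nn

class MultiReceptiveField(nn.Module):
    """Parallel branches with different effective receptive fields,
    densely fused: small kernels preserve fine detail while dilated
    kernels supply wider context (illustrative configuration)."""
    def __init__(self, ch):
        super().__init__()
        self.b3 = nn.Conv2d(ch, ch, 3, padding=1)
        self.b5 = nn.Conv2d(ch, ch, 3, padding=2, dilation=2)   # ~5x5 field
        self.b7 = nn.Conv2d(ch, ch, 3, padding=3, dilation=3)   # ~7x7 field
        self.fuse = nn.Conv2d(3 * ch, ch, 1)                    # dense fusion
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        cat = torch.cat([self.b3(x), self.b5(x), self.b7(x)], dim=1)
        return self.act(x + self.fuse(cat))                     # residual

print(MultiReceptiveField(32)(torch.randn(1, 32, 64, 64)).shape)
```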

18 pages, 11340 KiB  
Article
CLSANet: Cognitive Learning-Based Self-Adaptive Feature Fusion for Multimodal Visual Object Detection
by Han Peng, Qionglin Liu, Riqing Ruan, Shuaiqi Yuan and Qin Li
Electronics 2025, 14(15), 3082; https://doi.org/10.3390/electronics14153082 - 1 Aug 2025
Abstract
Multimodal object detection leverages the complementary characteristics of visible (RGB) and infrared (IR) imagery, making it well-suited for challenging scenarios such as low illumination, occlusion, and complex backgrounds. However, most existing fusion-based methods rely on static or heuristic strategies, limiting their adaptability to dynamic environments. To address this limitation, we propose CLSANet, a cognitive learning-based self-adaptive network that enhances detection performance by dynamically selecting and integrating modality-specific features. CLSANet consists of three key modules: (1) a Dominant Modality Identification Module that selects the most informative modality based on global scene analysis; (2) a Modality Enhancement Module that disentangles and strengthens shared and modality-specific representations; and (3) a Self-Adaptive Fusion Module that adjusts fusion weights spatially according to local scene complexity. Compared to existing methods, CLSANet achieves state-of-the-art detection performance with significantly fewer parameters and lower computational cost. Ablation studies further demonstrate the individual effectiveness of each module under different environmental conditions, particularly in low-light and occluded scenes. CLSANet offers a compact, interpretable, and practical solution for multimodal object detection in resource-constrained settings.
(This article belongs to the Special Issue Digital Intelligence Technology and Applications)
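One plausible reading of the Self-Adaptive Fusion Module is per-pixel gating between modalities; the sketch below predicts spatial softmax weights from the concatenated RGB/IR features so each location can lean on whichever modality is locally more informative. The gating design is an assumption based only on the abstract.

```python
import torch
import torch.nn as nn

class SelfAdaptiveFusion(nn.Module):
    """Spatially varying RGB/IR fusion: a 1x1 conv predicts per-pixel
    softmax weights over the two modalities (hypothetical reading of
    the abstract's Self-Adaptive Fusion Module)."""
    def __init__(self, ch):
        super().__init__()
        self.gate = nn.Conv2d(2 * ch, 2, kernel_size=1)

    def forward(self, rgb, ir):
        w = torch.softmax(self.gate(torch.cat([rgb, ir], dim=1)), dim=1)
        return w[:, :1] * rgb + w[:, 1:] * ir   # per-pixel weighted sum

rgb, ir = torch.randn(1, 64, 40, 40), torch.randn(1, 64, 40, 40)
print(SelfAdaptiveFusion(64)(rgb, ir).shape)   # torch.Size([1, 64, 40, 40])
```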

26 pages, 1790 KiB  
Article
A Hybrid Deep Learning Model for Aromatic and Medicinal Plant Species Classification Using a Curated Leaf Image Dataset
by Shareena E. M., D. Abraham Chandy, Shemi P. M. and Alwin Poulose
AgriEngineering 2025, 7(8), 243; https://doi.org/10.3390/agriengineering7080243 - 1 Aug 2025
Abstract
In the era of smart agriculture, accurate identification of plant species is critical for effective crop management, biodiversity monitoring, and the sustainable use of medicinal resources. However, existing deep learning approaches often underperform on fine-grained plant classification tasks due to the lack of domain-specific, high-quality datasets and the limited representational capacity of traditional architectures. This study addresses these challenges by introducing a novel, well-curated leaf image dataset consisting of 39 classes of medicinal and aromatic plants collected from the Aromatic and Medicinal Plant Research Station in Odakkali, Kerala, India. To overcome performance bottlenecks observed with a baseline Convolutional Neural Network (CNN) that achieved only 44.94% accuracy, we progressively enhanced model performance through a series of architectural innovations, including a pre-trained VGG16 network, data augmentation techniques, and fine-tuning of deeper convolutional layers, followed by the integration of Squeeze-and-Excitation (SE) attention blocks. Ultimately, we propose a hybrid deep learning architecture that combines VGG16 with Batch Normalization, Gated Recurrent Units (GRUs), Transformer modules, and Dilated Convolutions. This final model achieved a peak validation accuracy of 95.24%, significantly outperforming several baselines: custom CNN (44.94%), VGG-19 (59.49%), VGG-16 before augmentation (71.52%), Xception (85.44%), Inception v3 (87.97%), VGG-16 after data augmentation (89.24%), VGG-16 after fine-tuning (90.51%), MobileNetV2 (93.67%), and VGG16 with SE block (94.94%). These results demonstrate superior capability in capturing both local textures and global morphological features. The proposed solution not only advances the state of the art in plant classification but also contributes a valuable dataset to the research community. Its real-world applicability spans field-based plant identification, biodiversity conservation, and precision agriculture, offering a scalable tool for automated plant recognition in complex ecological and agricultural environments.
(This article belongs to the Special Issue Implementation of Artificial Intelligence in Agriculture)
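To make the hybrid pipeline concrete, here is a minimal PyTorch sketch chaining a VGG16 feature extractor, a GRU over the spatial token sequence, and a single Transformer encoder layer into a 39-class head. All layer sizes, the token pooling, and the wiring order are assumptions, not the authors' configuration.

```python
import torch
import torch.nn as nn
from torchvision.models import vgg16

class HybridLeafClassifier(nn.Module):
    """Sketch of a VGG16 + GRU + Transformer hybrid: VGG16 feature maps
    are flattened into a sequence of spatial tokens, processed
    recurrently and with self-attention, then mean-pooled."""
    def __init__(self, n_classes=39):
        super().__init__()
        self.backbone = vgg16(weights=None).features      # (B, 512, 7, 7)
        self.gru = nn.GRU(512, 256, batch_first=True)
        enc = nn.TransformerEncoderLayer(d_model=256, nhead=4, batch_first=True)
        self.transformer = nn.TransformerEncoder(enc, num_layers=1)
        self.head = nn.Linear(256, n_classes)

    def forward(self, x):
        f = self.backbone(x).flatten(2).transpose(1, 2)   # (B, 49, 512)
        h, _ = self.gru(f)
        h = self.transformer(h)
        return self.head(h.mean(dim=1))                   # pool over tokens

print(HybridLeafClassifier()(torch.randn(1, 3, 224, 224)).shape)  # (1, 39)
```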

25 pages, 10331 KiB  
Article
Forest Fire Detection Method Based on Dual-Branch Multi-Scale Adaptive Feature Fusion Network
by Qinggan Wu, Chen Wei, Ning Sun, Xiong Xiong, Qingfeng Xia, Jianmeng Zhou and Xingyu Feng
Forests 2025, 16(8), 1248; https://doi.org/10.3390/f16081248 - 31 Jul 2025
Abstract
There are significant scale and morphological differences between fire and smoke features in forest fire detection. This paper proposes a detection method based on a dual-branch multi-scale adaptive feature fusion network (DMAFNet). In this method, a convolutional neural network (CNN) and a Transformer form a dual-branch backbone that extracts local texture and global context information, respectively. To overcome the differences in feature distribution and response scale between the two branches, a feature correction module (FCM) is designed: through spatial and channel correction mechanisms, it adaptively aligns the two branches' features. The Fusion Feature Module (FFM) is further introduced to fully integrate the dual-branch features via a two-way cross-attention mechanism and effectively suppress redundant information. Finally, the Multi-Scale Fusion Attention Unit (MSFAU) is designed to enhance multi-scale detection of fire targets. Experimental results show that the proposed DMAFNet achieves significant improvements in mAP (mean average precision) over existing mainstream detection methods.
(This article belongs to the Section Natural Hazards and Risk Management)
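The FFM's two-way cross-attention can be sketched directly with nn.MultiheadAttention, each branch querying the other; the dimensions and residual wiring below are illustrative assumptions.

```python
import torch
import torch.nn as nn

class BidirectionalCrossAttention(nn.Module):
    """Two-way cross-attention between CNN (local texture) and
    Transformer (global context) token sequences, in the spirit of
    the abstract's FFM; sizes are illustrative."""
    def __init__(self, dim=128, heads=4):
        super().__init__()
        self.c2t = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.t2c = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, cnn_tok, trf_tok):
        # Each branch queries the other, then residual-adds the result.
        a, _ = self.c2t(cnn_tok, trf_tok, trf_tok)
        b, _ = self.t2c(trf_tok, cnn_tok, cnn_tok)
        return cnn_tok + a, trf_tok + b

c, t = torch.randn(1, 196, 128), torch.randn(1, 196, 128)
fc, ft = BidirectionalCrossAttention()(c, t)
print(fc.shape, ft.shape)   # both torch.Size([1, 196, 128])
```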

29 pages, 15488 KiB  
Article
GOFENet: A Hybrid Transformer–CNN Network Integrating GEOBIA-Based Object Priors for Semantic Segmentation of Remote Sensing Images
by Tao He, Jianyu Chen and Delu Pan
Remote Sens. 2025, 17(15), 2652; https://doi.org/10.3390/rs17152652 - 31 Jul 2025
Abstract
Geographic object-based image analysis (GEOBIA) has demonstrated substantial utility in remote sensing tasks. However, its integration with deep learning remains largely confined to image-level classification, primarily because the irregular shapes and fragmented boundaries of segmented objects limit its applicability to semantic segmentation. While convolutional neural networks (CNNs) excel at local feature extraction, they inherently struggle to capture long-range dependencies. In contrast, Transformer-based models are well suited to global context modeling but often lack fine-grained local detail. To overcome these limitations, we propose GOFENet (Geo-Object Feature Enhanced Network), a hybrid semantic segmentation architecture that effectively fuses object-level priors into deep feature representations. GOFENet employs a dual-encoder design combining CNN and Swin Transformer architectures, enabling multi-scale feature fusion through skip connections to preserve both local and global semantics. An auxiliary branch incorporating cascaded atrous convolutions injects information from segmented objects into the learning process. Furthermore, we develop a cross-channel selection module (CSM) for refined channel-wise attention, a feature enhancement module (FEM) to merge global and local representations, and a shallow–deep feature fusion module (SDFM) to integrate pixel- and object-level cues across scales. Experimental results on the GID and LoveDA datasets demonstrate that GOFENet achieves superior segmentation performance, with 66.02% and 51.92% mIoU, respectively. The model excels at delineating large-scale land cover features, producing sharper object boundaries and reducing classification noise while preserving the integrity and discriminability of land cover categories.
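A minimal sketch of the shallow–deep fusion idea (SDFM): upsample the deep, object-level features to the shallow, pixel-level resolution and fuse by concatenation and a 1x1 convolution. The channel sizes and the fusion operator are assumptions; the paper's module is likely more elaborate.

```python
import torch
import torch.nn as nn

class ShallowDeepFusion(nn.Module):
    """Fuse shallow (pixel-level) and deep (object-level) features by
    bilinear upsampling, concatenation, and 1x1 projection; a thin
    stand-in for the abstract's SDFM."""
    def __init__(self, shallow_ch=64, deep_ch=256, out_ch=64):
        super().__init__()
        self.proj = nn.Conv2d(shallow_ch + deep_ch, out_ch, 1)

    def forward(self, shallow, deep):
        deep = nn.functional.interpolate(deep, size=shallow.shape[-2:],
                                         mode="bilinear", align_corners=False)
        return self.proj(torch.cat([shallow, deep], dim=1))

s, d = torch.randn(1, 64, 128, 128), torch.randn(1, 256, 32, 32)
print(ShallowDeepFusion()(s, d).shape)   # torch.Size([1, 64, 128, 128])
```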

13 pages, 11739 KiB  
Article
DeepVinci: Organ and Tool Segmentation with Edge Supervision and a Densely Multi-Scale Pyramid Module for Robot-Assisted Surgery
by Li-An Tseng, Yuan-Chih Tsai, Meng-Yi Bai, Mei-Fang Li, Yi-Liang Lee, Kai-Jo Chiang, Yu-Chi Wang and Jing-Ming Guo
Diagnostics 2025, 15(15), 1917; https://doi.org/10.3390/diagnostics15151917 - 30 Jul 2025
Abstract
Background: Automated surgical navigation can be separated into three stages: (1) organ identification and localization, (2) identification of the organs requiring further surgery, and (3) automated planning of the operation path and steps. With its ideal visual and operating system, the da Vinci surgical system provides a promising platform for automated surgical navigation. This study focuses on the first stage by identifying organs in gynecological surgery. Methods: Because collecting da Vinci gynecological endoscopy data is difficult, we propose DeepVinci, a novel end-to-end high-performance encoder–decoder network based on convolutional neural networks (CNNs) for pixel-level organ semantic segmentation. To overcome the drawback of a limited field of view, we incorporate a densely multi-scale pyramid module and a feature fusion module, which also enhance global context information. In addition, the system integrates an edge supervision network to refine the segmented results on the decoding side. Results: Experimental results show that DeepVinci achieves state-of-the-art accuracy, with Dice similarity coefficient and mean pixel accuracy values of 0.684 and 0.700, respectively. Conclusions: The proposed DeepVinci network presents a practical and competitive semantic segmentation solution for da Vinci gynecological surgery.
(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)
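The densely multi-scale pyramid module can be approximated by a dilated-convolution pyramid with dense connections, as sketched below; the dilation rates and the dense wiring are assumptions based only on the module's name.

```python
import torch
import torch.nn as nn

class DenseMultiScalePyramid(nn.Module):
    """Dilated-convolution pyramid with dense connections: each branch
    receives the input concatenated with all earlier branch outputs,
    progressively widening the field of view for global context."""
    def __init__(self, ch, rates=(1, 2, 4, 8)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(ch * (i + 1), ch, 3, padding=r, dilation=r)
            for i, r in enumerate(rates))
        self.fuse = nn.Conv2d(ch * len(rates), ch, 1)

    def forward(self, x):
        feats = []
        for branch in self.branches:
            feats.append(torch.relu(branch(torch.cat([x] + feats, dim=1))))
        return self.fuse(torch.cat(feats, dim=1))

print(DenseMultiScalePyramid(32)(torch.randn(1, 32, 56, 56)).shape)
```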

19 pages, 7161 KiB  
Article
Dynamic Snake Convolution Neural Network for Enhanced Image Super-Resolution
by Weiqiang Xin, Ziang Wu, Qi Zhu, Tingting Bi, Bing Li and Chunwei Tian
Mathematics 2025, 13(15), 2457; https://doi.org/10.3390/math13152457 - 30 Jul 2025
Abstract
Image super-resolution (SR) is essential for enhancing image quality in critical applications such as medical imaging and satellite remote sensing. However, existing methods are often limited in their ability to process and integrate multi-scale information, from fine textures to global structures. To address these limitations, this paper proposes DSCNN, a dynamic snake convolution neural network for enhanced image super-resolution. DSCNN optimizes feature extraction and network architecture to improve both performance and efficiency. Its core innovation is a feature extraction and enhancement module with dynamic snake convolution, which dynamically adjusts the convolution kernel's shape and position to better fit the image's geometric structures, significantly improving feature extraction. To optimize the network's structure, DSCNN employs an enhanced residual framework that uses parallel convolutional layers and a global feature fusion mechanism to further strengthen feature extraction capability and gradient flow efficiency. Additionally, the network incorporates a SwishReLU-based activation function and a multi-scale convolutional concatenation structure; this multi-scale design effectively captures both local details and global image structure, enhancing SR reconstruction. The proposed DSCNN outperforms existing methods in both objective metrics and visual perception (e.g., it achieved the best PSNR and SSIM results on the Set5 ×4 dataset).
(This article belongs to the Special Issue Structural Networks for Image Application)
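The abstract names a SwishReLU activation and a multi-scale concatenation structure without formulas, so the sketch below is a guess: a ReLU-clamped Swish feeding parallel 1/3/5 kernels whose outputs are concatenated and fused. Treat both the activation and the kernel sizes as assumptions.

```python
import torch
import torch.nn as nn

def swish_relu(x):
    # The abstract gives no formula for its "SwishReLU-based activation";
    # this ReLU-clamped Swish is a guess used only for the sketch.
    return torch.relu(x * torch.sigmoid(x))

class MultiScaleConcat(nn.Module):
    """Multi-scale convolutional concatenation: kernels of several sizes
    run in parallel and their outputs are concatenated, capturing local
    detail and wider structure together (kernel sizes assumed)."""
    def __init__(self, ch):
        super().__init__()
        self.convs = nn.ModuleList(
            nn.Conv2d(ch, ch, k, padding=k // 2) for k in (1, 3, 5))
        self.fuse = nn.Conv2d(3 * ch, ch, 1)

    def forward(self, x):
        y = torch.cat([swish_relu(c(x)) for c in self.convs], dim=1)
        return x + self.fuse(y)            # residual connection

print(MultiScaleConcat(16)(torch.randn(1, 16, 48, 48)).shape)
```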

22 pages, 12983 KiB  
Article
A Hybrid Model for Fluorescein Funduscopy Image Classification by Fusing Multi-Scale Context-Aware Features
by Yawen Wang, Chao Chen, Zhuo Chen and Lingling Wu
Technologies 2025, 13(8), 323; https://doi.org/10.3390/technologies13080323 - 30 Jul 2025
Abstract
With the growing use of deep learning in medical image analysis, automated classification of fundus images is crucial for the early detection of fundus diseases. However, the complexity of fluorescein fundus angiography (FFA) images poses challenges for the accurate identification of lesions. To address these issues, we propose the Enhanced Feature Fusion ConvNeXt (EFF-ConvNeXt) model, a novel architecture combining VGG16 and an enhanced ConvNeXt for FFA image classification. VGG16 is employed to extract edge features, while an improved ConvNeXt incorporates a Context-Aware Feature Fusion (CAFF) strategy to enhance global contextual understanding. CAFF integrates an Improved Global Context (IGC) module with multi-scale feature fusion to jointly capture local and global features. Furthermore, an SKNet module is used in the final stages to adaptively recalibrate channel-wise features. The model achieves improved classification accuracy and robustness, reaching 92.50% accuracy and a 92.30% F1 score on the APTOS2023 dataset and surpassing the baseline ConvNeXt-T by 3.12% in accuracy and 4.01% in F1 score. These results highlight the model's ability to recognize complex disease features, providing significant support for more accurate diagnosis of fundus diseases.
(This article belongs to the Special Issue Application of Artificial Intelligence in Medical Image Analysis)
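The SKNet module the abstract references follows a known pattern: parallel branches with different receptive fields, mixed by softmax channel attention. The sketch below implements that pattern with assumed branch kernels and reduction ratio.

```python
import torch
import torch.nn as nn

class SelectiveKernel(nn.Module):
    """SKNet-style recalibration: two branches with different receptive
    fields; a squeezed global descriptor yields softmax weights that
    mix the branches per channel (branch design assumed)."""
    def __init__(self, ch, r=8):
        super().__init__()
        self.b1 = nn.Conv2d(ch, ch, 3, padding=1)
        self.b2 = nn.Conv2d(ch, ch, 3, padding=2, dilation=2)
        self.squeeze = nn.Sequential(nn.Linear(ch, ch // r), nn.ReLU())
        self.attn = nn.Linear(ch // r, 2 * ch)

    def forward(self, x):
        u1, u2 = self.b1(x), self.b2(x)
        s = (u1 + u2).mean(dim=(2, 3))                  # global descriptor
        a = self.attn(self.squeeze(s)).view(-1, 2, u1.size(1))
        w = torch.softmax(a, dim=1)[..., None, None]    # per-branch weights
        return w[:, 0] * u1 + w[:, 1] * u2

print(SelectiveKernel(64)(torch.randn(1, 64, 28, 28)).shape)
```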

40 pages, 7941 KiB  
Article
Synergistic Hierarchical AI Framework for USV Navigation: Closing the Loop Between Swin-Transformer Perception, T-ASTAR Planning, and Energy-Aware TD3 Control
by Haonan Ye, Hongjun Tian, Qingyun Wu, Yihong Xue, Jiayu Xiao, Guijie Liu and Yang Xiong
Sensors 2025, 25(15), 4699; https://doi.org/10.3390/s25154699 - 30 Jul 2025
Abstract
Autonomous Unmanned Surface Vehicle (USV) operations in complex ocean engineering scenarios necessitate robust navigation, guidance, and control technologies. These systems require reliable sensor-based object detection and efficient, safe, and energy-aware path planning. To address these multifaceted challenges, this paper proposes a novel synergistic AI framework that integrates (1) a novel adaptation of the Swin-Transformer that generates a dense semantic risk map from raw visual data, allowing the system to interpret ambiguous marine conditions such as sun glare and choppy water and providing the real-time environmental understanding crucial for guidance; (2) a Transformer-enhanced A-star (T-ASTAR) algorithm with spatio-temporal attentional guidance that generates globally near-optimal, energy-aware static paths; (3) a domain-adapted TD3 agent, the key control element, featuring a novel energy-aware reward function that respects USV hydrodynamic constraints and performs dynamic local path optimization and real-time obstacle avoidance for long-endurance missions; and (4) CUDA acceleration to meet the computational demands of real-time ocean engineering applications. Simulations and real-world data verify the framework's superiority over benchmarks such as A* and RRT, achieving 30% shorter routes, 70% fewer turns, 64.7% fewer dynamic collisions, and a 215-fold speedup in map generation via CUDA acceleration. This research underscores the importance of integrating powerful AI components within a hierarchical synergy encompassing AI-based perception, hierarchical decision planning for guidance, and multi-stage optimal search algorithms for control. The proposed solution significantly advances USV autonomy, addressing critical ocean engineering challenges such as navigation in dynamic environments, obstacle avoidance, and energy-constrained operation of unmanned maritime systems.
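The abstract does not give the energy-aware reward, so the function below is purely illustrative: progress toward the goal, minus an energy cost growing with thrust (a crude proxy for hydrodynamic losses), minus a collision penalty. All weights and the cubic energy term are assumptions.

```python
import numpy as np

def energy_aware_reward(pos, goal, prev_pos, thrust, collided,
                        w_prog=1.0, w_energy=0.05, w_coll=100.0):
    """Illustrative shaping of an energy-aware TD3 reward for a USV:
    reward progress toward the goal, charge a thrust-dependent energy
    cost, and penalize collisions (all terms are assumptions)."""
    progress = np.linalg.norm(prev_pos - goal) - np.linalg.norm(pos - goal)
    energy = np.abs(thrust) ** 3      # drag power scales roughly with speed^3
    return w_prog * progress - w_energy * energy - w_coll * float(collided)

r = energy_aware_reward(np.array([1.0, 0.0]), np.array([5.0, 0.0]),
                        np.array([0.5, 0.0]), thrust=0.8, collided=False)
print(round(r, 3))   # positive: the vehicle moved toward the goal
```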

21 pages, 2267 KiB  
Article
Dual-Branch Network for Blind Quality Assessment of Stereoscopic Omnidirectional Images: A Spherical and Perceptual Feature Integration Approach
by Zhe Wang, Yi Liu and Yang Song
Electronics 2025, 14(15), 3035; https://doi.org/10.3390/electronics14153035 - 30 Jul 2025
Abstract
Stereoscopic omnidirectional images (SOIs) have gained significant attention for the immersive viewing experience they provide by pairing binocular depth with panoramic scenes. However, evaluating their visual quality remains challenging due to their unique spherical geometry, binocular disparity, and viewing conditions. To address these challenges, this paper proposes a dual-branch deep learning framework that integrates spherical structural features and perceptual binocular cues to assess the quality of SOIs without reference. Specifically, the global branch leverages spherical convolutions to capture wide-range spatial distortions, while the local branch utilizes a binocular difference module based on the discrete wavelet transform to extract depth-aware perceptual information. A feature complementarity module fuses the global and local representations for final quality prediction. Experimental evaluations on two public SOIQA datasets, NBU-SOID and SOLID, demonstrate that the proposed method achieves state-of-the-art performance, with PLCC/SROCC values of 0.926/0.918 and 0.918/0.891, respectively. These results validate the effectiveness and robustness of our approach for stereoscopic omnidirectional image quality assessment.
(This article belongs to the Special Issue AI in Signal and Image Processing)
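A minimal sketch of a DWT-based binocular difference descriptor, standing in for the local branch's module: decompose the left/right difference image with a 2D Haar wavelet and summarize subband energies. The wavelet choice and energy features are assumptions.

```python
import numpy as np
import pywt

def binocular_difference_features(left, right):
    """Decompose the left/right difference image with a single-level
    2D Haar DWT and return the mean energy of each subband; a simple
    stand-in for the abstract's binocular difference module."""
    diff = left.astype(float) - right.astype(float)
    ll, (lh, hl, hh) = pywt.dwt2(diff, "haar")
    return np.array([np.mean(b ** 2) for b in (ll, lh, hl, hh)])

rng = np.random.default_rng(0)
L = rng.random((64, 64))
R = np.roll(L, 2, axis=1)    # crude stand-in for horizontal disparity
print(binocular_difference_features(L, R))
```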
