Search Results (692)

Search Parameters:
Keywords = global contextual information

22 pages, 3772 KB  
Article
A Degradation-Aware Dual-Path Network with Spatially Adaptive Attention for Underwater Image Enhancement
by Shasha Tian, Adisorn Sirikham, Jessada Konpang and Chuyang Wang
Electronics 2026, 15(2), 435; https://doi.org/10.3390/electronics15020435 - 19 Jan 2026
Viewed by 91
Abstract
Underwater image enhancement remains challenging due to wavelength-dependent absorption, spatially varying scattering, and non-uniform illumination, which jointly cause severe color distortion, contrast degradation, and structural information loss. To address these issues, we propose UCS-Net, a degradation-aware dual-path framework that exploits the complementarity between global and local representations. A spatial color balance module first stabilizes the chromatic distribution of degraded inputs through a learnable gray-world-guided normalization, mitigating wavelength-induced color bias prior to feature extraction. The network then adopts a dual-branch architecture, where a hierarchical Swin Transformer branch models long-range contextual dependencies and global color relationships, while a multi-scale residual convolutional branch focuses on recovering local textures and structural details suppressed by scattering. Furthermore, a multi-scale attention fusion mechanism adaptively integrates features from both branches in a degradation-aware manner, enabling dynamic emphasis on global or local cues according to regional attenuation severity. A hue-preserving reconstruction module is finally employed to suppress color artifacts and ensure faithful color rendition. Extensive experiments on UIEB, EUVP, and UFO benchmarks demonstrate that UCS-Net consistently outperforms state-of-the-art methods in both full-reference and non-reference evaluations. Qualitative results further confirm its effectiveness in restoring fine structural details while maintaining globally consistent and visually realistic colors across diverse underwater scenes. Full article
(This article belongs to the Special Issue Image Processing and Analysis)
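
The abstract above mentions a gray-world-guided normalization step. As a rough illustration only, the classic (non-learnable) gray-world assumption can be sketched as follows. The function name `gray_world_balance` and the assumption that the input is a float H x W x 3 array in [0, 1] are hypothetical choices made for this example; this is not the UCS-Net module itself.

```python
import numpy as np

def gray_world_balance(image: np.ndarray) -> np.ndarray:
    """Classic gray-world white balance (illustrative, not the paper's learnable module).

    Assumes `image` is an H x W x C float array with values in [0, 1].
    Each channel is rescaled so that its mean matches the mean over all channels.
    """
    img = image.astype(np.float64)
    # Mean intensity of every channel, computed over all pixels.
    channel_means = img.reshape(-1, img.shape[-1]).mean(axis=0)
    # The "gray" target is the average of the channel means.
    gray = channel_means.mean()
    # Per-channel gain; the small floor guards against division by zero.
    gains = gray / np.maximum(channel_means, 1e-8)
    return np.clip(img * gains, 0.0, 1.0)

# A flat image with a blue-green cast, as is typical underwater.
cast = np.full((4, 4, 3), [0.2, 0.4, 0.6])
balanced = gray_world_balance(cast)
```

After balancing, all three channel means of `balanced` coincide, which is the color-bias correction the gray-world assumption is meant to provide.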

40 pages, 1827 KB  
Article
Leveraging Blockchain and Digital Twins for Low-Carbon, Circular Supply Chains: Evidence from the Moroccan Manufacturing Sector
by Soukaina Abdallah-Ou-Moussa, Martin Wynn and Zakaria Rouaine
Sustainability 2026, 18(2), 991; https://doi.org/10.3390/su18020991 - 18 Jan 2026
Viewed by 267
Abstract
As global supply chains face increasing pressure to reconcile economic efficiency, environmental responsibility, and ethical transparency, emerging digital technologies offer unprecedented opportunities for sustainable transformation. This article examines this dynamic in the context of the Moroccan industrial sector, with particular reference to blockchain and digital twin technologies. The study employs a rigorous mixed-methods design, combining an in-depth qualitative exploration with 30 industry professionals and a Partial Least Squares Structural Equation Modeling (PLS-SEM) model based on survey data from 125 Moroccan manufacturing firms. The findings highlight the synergistic contribution of blockchain and digital twins in enabling circular, low-carbon, and resilient supply chains. Blockchain adoption strengthens environmental impact traceability, data reliability, and responsible governance, while digital twin systems enhance eco-efficiency through real-time modeling and predictive flow simulation. Circular integration emerges as a critical enabler, significantly amplifying the positive effects of both technologies by aligning physical and informational flows within closed-loop processes. With its strong empirical grounding and contextual relevance to an emerging economy, this research provides actionable insights for policymakers, industrial managers, and supply chain practitioners committed to accelerating the sustainable transformation of production systems. It also offers a renewed understanding of how digitalization and circularity jointly support environmental performance within industrial ecosystems. Full article
(This article belongs to the Topic Sustainable Supply Chain Practices in A Digital Age)

26 pages, 7951 KB  
Article
VIIRS Nightfire Super-Resolution Method for Multiyear Cataloging of Natural Gas Flaring Sites: 2012-2025
by Mikhail Zhizhin, Christopher D. Elvidge, Tilottama Ghosh, Gregory Gleason and Morgan Bazilian
Remote Sens. 2026, 18(2), 314; https://doi.org/10.3390/rs18020314 - 16 Jan 2026
Viewed by 158
Abstract
We present a new method for mapping global gas flaring using a multiyear spatio-temporal database of VIIRS Nightfire (VNF) nighttime infrared detections from the Suomi NPP, NOAA-20, and NOAA-21 satellites. The method is designed to resolve closely spaced industrial combustion sources and to produce a stable, physically meaningful flare catalog suitable for long-term monitoring and emissions analysis. The method combines adaptive spatial aggregation of high-temperature detections with a hierarchical clustering that super-resolves individual flare stacks within oil and gas fields. Post-processing yields physically consistent flare footprints and attraction regions, allowing separation of closely spaced sources. Flare clusters are assigned to operational categories (e.g., upstream, midstream, LNG) using prior catalogs combined with AI-assisted expert interpretation. In this step, a multimodal large language model (LLM) provides contextual classification suggestions based on geospatial information, high-resolution daytime imagery, and detection time-series summaries, while final attribution is performed and validated by domain experts. Compared with annual flare catalogs commonly used for national flaring estimates, the new catalog demonstrates substantially improved performance. It is more selective in the presence of intense atmospheric glow from large flares, identifies approximately twice as many active flares, and localizes individual stacks with ~50 m precision, resolving emitters separated by ~400–700 m. For the well-defined class of downstream flares at LNG export facilities, the catalog achieves complete detectability. These improvements support more accurate flare inventories, facility-level attribution, and policy-relevant assessments of gas flaring activity. Full article
(This article belongs to the Section Environmental Remote Sensing)

23 pages, 3847 KB  
Article
DRPU-YOLO11: A Multi-Scale Model for Detecting Rice Panicles in UAV Images with Complex Infield Background
by Dongchen Huang, Zhipeng Chen, Jiajun Zhuang, Ge Song, Huasheng Huang, Feilong Li, Guogang Huang and Changyu Liu
Agriculture 2026, 16(2), 234; https://doi.org/10.3390/agriculture16020234 - 16 Jan 2026
Viewed by 1580
Abstract
In the field of precision agriculture, accurately detecting rice panicles is crucial for monitoring rice growth and managing rice production. To address the challenges posed by complex field backgrounds, including variety differences, variations across growth stages, background interference, and occlusion due to dense distribution, this study develops an improved YOLO11-based rice panicle detection model, termed DRPU-YOLO11. The model incorporates a task-oriented CSP-PGMA module in the backbone to enhance multi-scale feature extraction and provide richer representations for downstream detection. In the neck network, DySample and CGDown are adopted to strengthen global contextual feature aggregation and suppress background interference for small targets. Furthermore, fine-grained P2 level information is integrated with higher-level features through a cross-scale fusion module (CSP-ONMK) to improve detection robustness in dense and occluded scenes. In addition, the PowerTAL strategy adapts quality-aware label assignment to emphasize high-quality predictions during training. The experimental results based on a self-constructed dataset demonstrate that DRPU-YOLO11 significantly outperforms baseline models in rice panicle detection under complex field environments, achieving an accuracy of 82.5%. Compared with the baseline model YOLO11 and RT-DETR, the mAP50 increases by 2.4% and 5.0%, respectively. These results indicate that the proposed task-driven design provides a practical and high-precision solution for rice panicle detection, with potential applications in rice growth monitoring and yield estimation. Full article
(This article belongs to the Section Artificial Intelligence and Digital Agriculture)

10 pages, 492 KB  
Proceeding Paper
Precision Localization of Autonomous Vehicles in Urban Environments: An Experimental Study with RFID Markers
by Svetozar Stefanov, Valentina Markova and Miroslav Markov
Eng. Proc. 2026, 122(1), 7; https://doi.org/10.3390/engproc2026122007 - 14 Jan 2026
Viewed by 132
Abstract
This paper presents an experimental study validating the feasibility of Radio Frequency Identification (RFID) marker systems as a complementary solution for autonomous vehicle (AV) localization in Global Navigation Satellite System (GNSS)-degraded urban environments. A novel synchronized dynamic testbed featuring hardware-level integration with wheel revolution tracking enables precise correlation of RFID marker reads with vehicle angular position. Experimental results demonstrate that multi-antenna configurations achieve consistently high read success rates (up to 99.6% at 0.5 m distance), sub-meter localization accuracy (~55 cm marker spacing), and reliable performance at average urban speeds (36 km/h simulated velocity). Spatial diversity from four strategically positioned antennas overcomes multipath interference and orientation challenges inherent to high-speed RFID reading. Processing latency remains well within the 58 ms time budget critical for autonomous navigation. These findings validate RFID’s potential for smart road infrastructure integration and demonstrate a scalable, cost-effective solution for enhancing AV safety and decision-making capabilities through contextual information transmission. Full article

18 pages, 5889 KB  
Article
High-Resolution Mapping Coastal Wetland Vegetation Using Frequency-Augmented Deep Learning Method
by Ning Gao, Xinyuan Du, Peng Xu, Erding Gao and Yixin Yang
Remote Sens. 2026, 18(2), 247; https://doi.org/10.3390/rs18020247 - 13 Jan 2026
Viewed by 123
Abstract
Coastal wetland vegetation exhibits pronounced spectral mixing, complex mosaic spatial patterns, and small target sizes, posing considerable challenges for fine-grained classification in high-resolution UAV imagery. At present, remote sensing classification of ground objects based on deep learning mainly relies on spectral and structural features, while the frequency domain features of ground objects are not fully considered. To address these issues, this study proposes a vegetation classification model that integrates spatial-domain and frequency-domain features. The model enhances global contextual modeling through a large-kernel convolution branch, while a frequency-domain interaction branch separates and fuses low-frequency structural information with high-frequency details. In addition, a shallow auxiliary supervision module is introduced to improve local detail learning and stabilize training. With a compact parameter scale suitable for real-world deployment, the proposed framework effectively adapts to high-resolution remote sensing scenarios. Experiments on typical coastal wetland vegetation including Reeds, Spartina alterniflora, and Suaeda salsa demonstrate that the proposed method consistently outperforms representative segmentation models such as UNet, DeepLabV3, TransUNet, SegFormer, D-LinkNet, and MCCA across multiple metrics including Accuracy, Recall, F1 Score, and mIoU. Overall, the results show that the proposed model effectively addresses the challenges of subtle spectral differences, pervasive species mixture, and intricate structural details, offering a robust and efficient solution for UAV-based wetland vegetation mapping and ecological monitoring. Full article

24 pages, 5571 KB  
Article
Bearing Fault Diagnosis Based on a Depthwise Separable Atrous Convolution and ASPP Hybrid Network
by Xiaojiao Gu, Chuanyu Liu, Jinghua Li, Xiaolin Yu and Yang Tian
Machines 2026, 14(1), 93; https://doi.org/10.3390/machines14010093 - 13 Jan 2026
Viewed by 129
Abstract
To address the computational redundancy, inadequate multi-scale feature capture, and poor noise robustness of traditional deep networks used for bearing vibration and acoustic signal feature extraction, this paper proposes a fault diagnosis method based on Depthwise Separable Atrous Convolution (DSAC) and Acoustic Spatial Pyramid Pooling (ASPP). First, the Continuous Wavelet Transform (CWT) is applied to the vibration and acoustic signals to convert them into time–frequency representations. The vibration CWT is then fed into a multi-scale feature extraction module to obtain preliminary vibration features, whereas the acoustic CWT is processed by a Deep Residual Shrinkage Network (DRSN). The two feature streams are concatenated in a feature fusion module and subsequently fed into the DSAC and ASPP modules, which together expand the effective receptive field and aggregate multi-scale contextual information. Finally, global pooling followed by a classifier outputs the bearing fault category, enabling high-precision bearing fault identification. Experimental results show that, under both clean data and multiple low signal-to-noise ratio (SNR) noise conditions, the proposed DSAC-ASPP method achieves higher accuracy and lower variance than baselines such as ResNet, VGG, and MobileNet, while requiring fewer parameters and FLOPs and exhibiting superior robustness and deployability. Full article
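
The abstract above attributes fewer parameters and a larger receptive field to depthwise separable atrous convolution. The arithmetic behind both claims can be sketched with the standard textbook formulas below. The function names are hypothetical, bias terms are ignored, and the channel sizes in the usage lines are arbitrary values chosen for illustration, not figures from the paper.

```python
def standard_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a standard k x k convolution (biases ignored)."""
    return k * k * c_in * c_out

def depthwise_separable_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise

def effective_kernel_size(k: int, dilation: int) -> int:
    """Span covered by a k-tap kernel whose taps are `dilation` positions apart."""
    return k + (k - 1) * (dilation - 1)

# Arbitrary example sizes: 3 x 3 kernel, 64 input channels, 128 output channels.
full = standard_conv_params(3, 64, 128)
separable = depthwise_separable_params(3, 64, 128)
span = effective_kernel_size(3, 4)
```

Dilation widens the span of the kernel without adding weights, which is why the two techniques combine naturally in lightweight networks.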

23 pages, 54003 KB  
Article
TRACE: Topical Reasoning with Adaptive Contextual Experts
by Jiabin Ye, Qiuyi Xin, Chu Zhang and Hengnian Qi
Big Data Cogn. Comput. 2026, 10(1), 31; https://doi.org/10.3390/bdcc10010031 - 13 Jan 2026
Viewed by 199
Abstract
Retrieval-Augmented Generation (RAG) is widely used for long-text summarization due to its efficiency and scalability. However, standard RAG methods flatten documents into independent chunks, disrupting sequential flow and thematic structure, resulting in significant loss of contextual information. This paper presents MOEGAT, a novel graph-enhanced retrieval framework that addresses this limitation by explicitly modeling document structure. MOEGAT constructs an Orthogonal Context Graph to capture sequential discourse and global semantic relationships—long-range dependencies between non-adjacent text spans that reflect topical similarity and logical associations beyond local context. It then employs a query-aware Mixture-of-Experts Graph Attention Network to dynamically activate specialized reasoning pathways. Experiments conducted on three public long-text summarization datasets demonstrate that MOEGAT achieves state-of-the-art performance. Notably, on the WCEP dataset, it outperforms the previous state-of-the-art Graph of Records (GOR) baseline by 14.9%, 18.1%, and 18.4% on ROUGE-L, ROUGE-1, and ROUGE-2, respectively. These substantial gains, especially the 14.9% improvement in ROUGE-L, reflect significantly better capture of long-range coherence and thematic integrity in summaries. Ablation studies confirm the effectiveness of the orthogonal graph and Mixture-of-Experts components. Overall, this work introduces a novel structure-aware approach to RAG that explicitly models and leverages document structure through an orthogonal graph representation and query-aware Mixture-of-Experts reasoning. Full article
(This article belongs to the Special Issue Generative AI and Large Language Models)
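
The results above are reported in ROUGE-L, which scores a summary by the longest common subsequence (LCS) it shares with a reference. A minimal sketch of that idea follows. It assumes lowercase whitespace tokenization and a balanced F1 score; `lcs_length` and `rouge_l_f1` are hypothetical helper names, and published ROUGE implementations may differ in tokenization, stemming, and F-measure weighting.

```python
from typing import Sequence

def lcs_length(a: Sequence[str], b: Sequence[str]) -> int:
    """Length of the longest common subsequence, via row-by-row dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]

def rouge_l_f1(candidate: str, reference: str) -> float:
    """Simplified ROUGE-L: F1 of LCS-based precision and recall."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)

score = rouge_l_f1("the cat sat", "the cat on the mat sat")
```

Because the LCS respects word order without requiring adjacency, the metric rewards summaries that preserve the sequence of the reference, which is why it is often read as a proxy for coherence.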

22 pages, 3427 KB  
Article
FCS-Net: A Frequency-Spatial Coordinate and Strip-Augmented Network for SAR Oil Spill Segmentation
by Shentao Wang, Byung-Won Min, Depeng Gao and Yue Hong
J. Mar. Sci. Eng. 2026, 14(2), 168; https://doi.org/10.3390/jmse14020168 - 13 Jan 2026
Viewed by 179
Abstract
Accurate segmentation of marine oil spills in synthetic aperture radar (SAR) images is crucial for emergency response and environmental remediation. However, current deep learning methods are still limited by two long-standing bottlenecks: first, multiplicative speckle noise and complex background clutter make it difficult to accurately delineate actual oil spills; and second, limited receptive fields often lead to the geometric fragmentation of elongated, irregular oil films. To surmount these challenges, this paper proposes a novel framework termed the Frequency-Spatial Coordinate and Strip-Augmented Network (FCS-Net). First, we leverage the ConvNeXt-Small backbone to extract robust hierarchical features, utilizing its large kernel design to capture broad contextual information. Second, a Frequency-Spatial Coordinate Attention (FS-CA) module is proposed to integrate spatial coordinate encoding with global frequency-domain information. Third, to maintain the morphological integrity of elongated targets, we introduce a Strip-Augmented Pyramid Pooling (SAPP) module which employs anisotropic strip pooling to model long-range dependencies. Extensive experiments on the multi-source SOS dataset demonstrate the effectiveness of FCS-Net. The proposed method achieves state-of-the-art performance, reaching an mIoU of 87.78% in the Gulf of Mexico and 89.62% in the challenging Persian Gulf, outperforming strong baselines and demonstrating superior robustness in complex ocean scenarios. Full article
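
The abstract above relies on strip pooling to capture elongated structures. A bare-bones sketch of the general idea on a single 2D feature map is given below. It averages along full rows and full columns and broadcasts the two strips back to the original size. The function name `strip_pool` is hypothetical, and the convolutions and gating found in practical strip-pooling modules are omitted, so this should not be read as the SAPP module itself.

```python
import numpy as np

def strip_pool(feature: np.ndarray) -> np.ndarray:
    """Combine horizontal and vertical strip averages of an H x W feature map."""
    # Horizontal strips: one average per row, shape H x 1.
    row_context = feature.mean(axis=1, keepdims=True)
    # Vertical strips: one average per column, shape 1 x W.
    col_context = feature.mean(axis=0, keepdims=True)
    # Broadcasting expands both strips back to H x W before summing.
    return row_context + col_context

feature = np.arange(6, dtype=np.float64).reshape(2, 3)
context = strip_pool(feature)
```

Each output position aggregates information from its entire row and column, so a long, thin region influences every pixel along its axis, unlike a small square pooling window.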

28 pages, 3553 KB  
Article
GCN-Embedding Swin–Unet for Forest Remote Sensing Image Semantic Segmentation
by Pingbo Liu, Gui Zhang and Jianzhong Li
Remote Sens. 2026, 18(2), 242; https://doi.org/10.3390/rs18020242 - 12 Jan 2026
Viewed by 229
Abstract
Forest resources are among the most important ecosystems on the earth. The semantic segmentation and accurate positioning of ground objects in forest remote sensing (RS) imagery are crucial to the emergency treatment of forest natural disasters, especially forest fires. Currently, most existing methods for image semantic segmentation are built upon convolutional neural networks (CNNs). Nevertheless, these techniques face difficulties in directly accessing global contextual information and accurately detecting geometric transformations within the image’s target regions. This limitation stems from the inherent locality of convolution operations, which are restricted to processing data structured in Euclidean space and confined to square-shaped regions. Inspired by the graph convolution network (GCN) with robust capabilities in processing irregular and complex targets, as well as Swin Transformers renowned for exceptional global context modeling, we present a hybrid semantic segmentation framework for forest RS imagery termed GSwin–Unet. This framework embeds the GCN model into Swin–Unet architecture to address the issue of low semantic segmentation accuracy of RS imagery in forest scenarios, which is caused by the complex texture features, diverse shapes, and unclear boundaries of land objects. GSwin–Unet features a parallel dual-encoder architecture of GCN and Swin Transformer. First, we integrate the Zero-DCE (Zero-Reference Deep Curve Estimation) algorithm into GSwin–Unet to enhance forest RS image feature representation. Second, a feature aggregation module (FAM) is proposed to bridge the dual encoders by fusing GCN-derived local aggregated features with Swin Transformer-extracted features. 
Our study demonstrates that, compared with the baseline models TransUnet, Swin–Unet, Unet, and DeepLab V3+, the GSwin–Unet achieves improvements of 7.07%, 5.12%, 8.94%, and 2.69% in the mean Intersection over Union (MIoU) and 3.19%, 1.72%, 4.3%, and 3.69% in the average F1 score (Ave.F1), respectively, on the RGB forest RS dataset. On the NIRGB forest RS dataset, the improvements in MIoU are 5.75%, 3.38%, 6.79%, and 2.44%, and the improvements in Ave.F1 are 4.02%, 2.38%, 4.72%, and 1.67%, respectively. Meanwhile, GSwin–Unet shows excellent adaptability on the selected GID dataset with high forest coverage, where the MIoU and Ave.F1 reach 72.92% and 84.3%, respectively. Full article

27 pages, 1843 KB  
Article
AI-Driven Modeling of Near-Mid-Air Collisions Using Machine Learning and Natural Language Processing Techniques
by Dothang Truong
Aerospace 2026, 13(1), 80; https://doi.org/10.3390/aerospace13010080 - 12 Jan 2026
Viewed by 188
Abstract
As global airspace operations grow increasingly complex, the risk of near-mid-air collisions (NMACs) poses a persistent and critical challenge to aviation safety. Traditional collision-avoidance systems, while effective in many scenarios, are limited by rule-based logic and reliance on transponder data, particularly in environments featuring diverse aircraft types, unmanned aerial systems (UAS), and evolving urban air mobility platforms. This paper introduces a novel, integrative machine learning framework designed to analyze NMAC incidents using the rich, contextual information contained within the NASA Aviation Safety Reporting System (ASRS) database. The methodology is structured around three pillars: (1) natural language processing (NLP) techniques are applied to extract latent topics and semantic features from pilot and crew incident narratives; (2) cluster analysis is conducted on both textual and structured incident features to empirically define distinct typologies of NMAC events; and (3) supervised machine learning models are developed to predict pilot decision outcomes (evasive action vs. no action) based on integrated data sources. The analysis reveals seven operationally coherent topics that reflect communication demands, pattern geometry, visibility challenges, airspace transitions, and advisory-driven interactions. A four-cluster solution further distinguishes incident contexts ranging from tower-directed approaches to general aviation pattern and cruise operations. The Random Forest model produces the strongest predictive performance, with topic-based indicators, miss distance, altitude, and operating rule emerging as influential features. The results show that narrative semantics provide measurable signals of coordination load and acquisition difficulty, and that integrating text with structured variables enhances the prediction of maneuvering decisions in NMAC situations. 
These findings highlight opportunities to strengthen radio practice, manage pattern spacing, improve mixed equipage awareness, and refine alerting in short-range airport area encounters. Full article
(This article belongs to the Section Air Traffic and Transportation)

28 pages, 5526 KB  
Article
Symmetry-Aware SwinUNet with Integrated Attention for Transformer-Based Segmentation of Thyroid Ultrasound Images
by Ammar Oad, Imtiaz Hussain Koondhar, Feng Dong, Weibing Liu, Beiji Zou, Weichun Liu, Yun Chen and Yaoqun Wu
Symmetry 2026, 18(1), 141; https://doi.org/10.3390/sym18010141 - 10 Jan 2026
Viewed by 241
Abstract
Accurate segmentation of thyroid nodules in ultrasound images remains challenging due to low contrast, speckle noise, and inter-patient variability that disrupt the inherent spatial symmetry of thyroid anatomy. This study proposes a symmetry-aware SwinUNet framework with integrated spatial attention for thyroid nodule segmentation. The hierarchical window-based Swin Transformer encoder preserves spatial symmetry and scale consistency while capturing both global contextual information and fine-grained local features. Attention modules in the decoder emphasize symmetry consistent anatomical regions and asymmetric nodule boundaries, effectively suppressing irrelevant background responses. The proposed method was evaluated on the publicly available TN3K thyroid ultrasound dataset. Experimental results demonstrate strong performance, achieving a Dice Similarity Coefficient of 85.51%, precision of 87.05%, recall of 89.13%, an IoU of 78.00%, accuracy of 97.02%, and an AUC of 99.02%. Compared with the baseline model, the proposed approach improves the IoU and Dice score by 15.38% and 12.05%, respectively, confirming its ability to capture symmetry-preserving nodule morphology and boundary asymmetry. These findings indicate that the proposed symmetry-aware SwinUNet provides a robust and clinically promising solution for thyroid ultrasound image analysis and computer-aided diagnosis. Full article
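
The Dice Similarity Coefficient and IoU reported above are both overlap measures between a predicted mask and a ground-truth mask. A minimal sketch for a single pair of binary masks follows. The function name `dice_and_iou` and the convention of returning 1.0 when both masks are empty are assumptions made for this example. Dataset-level figures such as those in the abstract are typically averaged over many images, so they need not satisfy the single-mask relation between the two metrics.

```python
import numpy as np

def dice_and_iou(pred, target) -> tuple[float, float]:
    """Dice coefficient and IoU for one pair of binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    total = pred.sum() + target.sum()
    # Assumed convention: two empty masks count as a perfect match.
    dice = 2.0 * intersection / total if total > 0 else 1.0
    iou = intersection / union if union > 0 else 1.0
    return float(dice), float(iou)

dice, iou = dice_and_iou([[1, 1], [0, 0]], [[1, 0], [0, 0]])
```

Dice counts the intersection twice in the numerator, so for any single mask pair it is at least as large as IoU.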

23 pages, 4184 KB  
Article
A New Encoding Architecture Based on Shift Multilayer Perceptron and Transformer for Medical Image Segmentation
by Hepeng Zhong, Jieqiong Yang, Yingfei Wu and Jizheng Yi
Sensors 2026, 26(2), 449; https://doi.org/10.3390/s26020449 - 9 Jan 2026
Viewed by 239
Abstract
Accurate medical image segmentation plays a crucial role in clinical diagnosis by precisely delineating diseased tissues and organs from various medical imaging modalities. However, existing segmentation methods often fail to effectively capture low-level structural details and exhibit inconsistencies in feature connection, which may compromise diagnostic reliability. To address these limitations, this study proposes a novel Multilayer Perceptron–Transformer (MPT) encoding architecture that integrates the Shift Multilayer Perceptron (Shift MLP) and Transformer mechanisms. Specifically, a SENet-based Atrous Spatial Pyramid Pooling module is designed to extract multi-scale contextual representations, while the Shift MLP refines underlying spatial features. Moreover, a channel–feature aggregation attention module is introduced to strengthen information flow between the encoder and decoder layers. Experimental results on the Automated Cardiac Diagnosis Challenge dataset show an average Dice Similarity Coefficient (DSC) of 87.01% (83.32% for the right ventricle, 90.90% for the left ventricle, and 86.83% for the myocardium). On the Synapse multi-organ segmentation dataset, the proposed model achieves an average DSC of 79.35% and a 95% Hausdorff Distance of 20.07 mm. These results demonstrate that MPT effectively captures both local and global anatomical structures, providing reliable support for clinical diagnosis.
(This article belongs to the Special Issue Vision- and Image-Based Biomedical Diagnostics—2nd Edition)

22 pages, 5463 KB  
Article
SRG-YOLO: Star Operation and Restormer-Based YOLOv11 via Global Context for Vehicle Object Detection
by Wei Song, Junying Min and Jiaqi Zhao
Automation 2026, 7(1), 15; https://doi.org/10.3390/automation7010015 - 7 Jan 2026
Viewed by 177
Abstract
Conventional object detection methods still suffer from insufficient detection accuracy in complex scenes and low computational efficiency. To overcome these defects, this paper proposes a Star operation and Restormer-based YOLOv11 model that leverages global context for vehicle detection (SRG-YOLO), which aims to enhance both detection accuracy and efficiency in complex environments. First, a Star block is introduced into the YOLOv11n architecture. By enhancing non-linear feature representation, this Star block improves the original C3K2 module, thereby strengthening multi-scale feature fusion and boosting detection accuracy in complex scenarios. Second, Restormer is incorporated into the detection heads of YOLOv11n via the improved C3K2 module to explicitly leverage spatial prior information, optimize the self-attention mechanism, and augment the long-range pixel dependencies of YOLOv11n. This integration not only reduces computational complexity but also improves detection precision and overall efficiency through more refined feature modeling. Third, a Context-guided module is integrated to enhance the ability to capture object details using global context. In complex backgrounds, it effectively combines local features with their contextual information, substantially improving the detection robustness of YOLOv11n. Finally, experiments on the VisDrone2019, KITTI, and UA-DETRAC datasets show that SRG-YOLO achieves superior vehicle detection accuracy in complex scenes compared to conventional methods, with particular advantages in small object detection.
(This article belongs to the Collection Automation in Intelligent Transportation Systems)

22 pages, 10194 KB  
Article
MBFI-Net: Multi-Branch Feature Interaction Network for Semantic Change Detection
by Qing Ding, Fengyan Wang, Kaiyuan Sun, Weilong Chen, Mingchang Wang and Gui Cheng
Remote Sens. 2026, 18(1), 179; https://doi.org/10.3390/rs18010179 - 5 Jan 2026
Viewed by 308
Abstract
Semantic change detection (SCD) effectively captures ground object transition information within change regions, delivering more comprehensive and detailed results than binary change detection (BCD) tasks. Existing multi-task SCD models enable parallel processing of segmentation and BCD of bi-temporal remote sensing images, but they still have shortcomings in feature mining, interaction, and cross-task transfer. To address these limitations, a multi-branch feature interaction network (MBFI-Net) is proposed. MBFI-Net designs parallel encoding branches with attention mechanisms that enhance semantic change perception by jointly modeling global contextual patterns and local details. In addition, MBFI-Net proposes bi-temporal feature interaction (BTFI) and cross-task feature transfer (CTFT) modules to improve feature diversity and representativeness, and combines them with prior logical relationship constraints to improve SCD performance. Comparative and ablation studies on the SECOND and Landsat-SCD datasets highlight the superiority and robustness of MBFI-Net, which achieves SeKs of 0.2117 and 0.5543, respectively. Furthermore, MBFI-Net strikes a balance between SCD results and model complexity and has superior detection performance for semantic change categories that make up only a small proportion of the data.