Search Results (56)

Search Parameters:
Keywords = successive attention fusion module

17 pages, 2225 KB  
Article
A Knowledge-Guide Data-Driven Model with Selective Wavelet Kernel Fusion Neural Network for Gearbox Intelligent Fault Diagnosis
by Nan Zhuang, Zhaogang Ren, Dongyao Yang, Xu Tian and Yingwu Wang
Sensors 2025, 25(24), 7656; https://doi.org/10.3390/s25247656 - 17 Dec 2025
Viewed by 192
Abstract
The gearbox is a critical component in modern industrial systems, directly determining the operational reliability of machinery. Therefore, effective fault diagnosis is essential to ensure its proper functioning. Modern diagnostic approaches often employ accelerometers to monitor vibration signals and apply data-driven techniques for fault identification, achieving considerable success. However, deep learning-based methods still face limitations due to their “black-box” nature and lack of interpretability. To address these issues, this paper proposes a knowledge-guided selective wavelet kernel fusion neural network. By integrating diagnostic domain knowledge into data-driven modeling, the proposed method enhances both the interpretability and diagnostic performance of intelligent fault diagnosis systems. First, a multi-kernel convolutional module is designed based on domain knowledge and embedded into a Modern Temporal Convolutional Network. Then, an attention-based selective wavelet kernel fusion strategy is introduced to adaptively fuse kernels according to the distribution of different datasets. Finally, the effectiveness of the proposed method is validated on two public datasets. Experimental results demonstrate that the approach not only provides prior interpretability, overcoming the black-box limitation of deep learning, but also further improves diagnostic accuracy. Full article
(This article belongs to the Special Issue Deep Learning Based Intelligent Fault Diagnosis)

26 pages, 22711 KB  
Article
Advanced Servo Control and Adaptive Path Planning for a Vision-Aided Omnidirectional Launch Platform in Sports-Training Applications
by Shuai Wang, Yinuo Xie, Kangyi Huang, Jun Lang, Qi Liu and Yaoming Zhuang
Actuators 2025, 14(12), 614; https://doi.org/10.3390/act14120614 - 15 Dec 2025
Viewed by 265
Abstract
A system-level scheme that couples a multi-dimensional attention-fused vision model and an improved Dijkstra planner is proposed for basketball robots in complex scenes. Fast-moving object detection, cluttered background recognition, and real-time path decision are targeted. For vision, the proposed YOLO11 with Multi-dimensional Attention Fusion (YOLO11-MAF) is equipped with four modules: Coordinate Attention (CoordAttention), Efficient Channel Attention (ECA), Multi-Scale Channel Attention (MSCA), and Large-Separable Kernel Attention (LSKA). Detection accuracy and robustness for high-speed basketballs are raised. For planning, an improved Dijkstra algorithm is proposed. Binary heap optimization and heuristic fusion cut time complexity from O(V²) to O((V + E) log V). Redundant expansions are removed and planning speed is increased. A complete robot platform integrating mechanical, electronic, and software components is constructed. End-to-end experiments show the improved vision model raises mAP@0.5 by 0.7% while keeping real-time frames per second (FPS). The improved path planning algorithm cuts average compute time by 16% and achieves over 95% obstacle avoidance success. The work offers a new approach for real-time perception and autonomous navigation of intelligent sport robots. It lays a basis for future multi-sensor fusion and adaptive path planning research. Full article
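
The complexity figure quoted in this abstract matches the textbook binary-heap form of Dijkstra's algorithm. The sketch below is a generic illustration of that form only, not the authors' planner: it omits their heuristic fusion, and the graph, the function name `dijkstra`, and the edge weights are hypothetical.

```python
import heapq


def dijkstra(graph, source):
    """Shortest distances from source.

    graph maps each node to a list of (neighbor, non-negative weight) pairs.
    Stale heap entries are skipped instead of being decreased in place, so the
    heap holds at most one entry per edge and the cost stays O((V + E) log V).
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # a shorter path to u was already settled
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


# Hypothetical four-node graph, for illustration only.
graph = {
    "A": [("B", 1), ("C", 4)],
    "B": [("C", 2), ("D", 5)],
    "C": [("D", 1)],
    "D": [],
}
distances = dijkstra(graph, "A")
```

An array-based implementation scans all unsettled nodes to find the minimum at every step, which is where the O(V²) baseline comes from; the heap replaces that scan with a logarithmic pop.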

34 pages, 9676 KB  
Article
Multi-Attention Meets Pareto Optimization: A Reinforcement Learning Method for Adaptive UAV Formation Control
by Li Zheng, Junjie Zeng, Long Qin and Rusheng Ju
Drones 2025, 9(12), 845; https://doi.org/10.3390/drones9120845 - 8 Dec 2025
Viewed by 434
Abstract
Autonomous multi-UAV formation control in cluttered urban environments remains challenging due to partial observability, dense and dynamic obstacles, and conflicting objectives (task efficiency, energy use, and safety). Yet many MARL-based approaches still collapse vector-valued objectives into a single hand-tuned reward and lack selective information fusion, leading to brittle trade-offs and poor scalability in urban clutter. We introduce a model-agnostic MARL framework—instantiated on MADDPG for concreteness—that augments a CTDE backbone with three lightweight attention modules (self, inter-agent, and entity) for selective information fusion, and a Pareto optimization module that maintains a compact archive of non-dominated policies to adaptively guide objective trade-offs using simple, interpretable rewards rather than fragile weightings. On city-scale navigation tasks, the approach improves final team success by 13–27 percentage points for N = 2–5 while simultaneously reducing collisions, tightening formation, and lowering control effort. These gains require no algorithm-specific tuning and hold consistently across the tested team sizes (N = 2–5), underscoring a stronger safety–efficiency trade-off and robust applicability in cluttered, partially observable settings. Full article
(This article belongs to the Section Artificial Intelligence in Drones (AID))
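
The "compact archive of non-dominated policies" in this abstract can be pictured with a minimal Pareto archive. This is a sketch under stated assumptions, not the paper's module: every objective is treated as "larger is better", and the function names and the example score tuples are hypothetical.

```python
def dominates(a, b):
    """True if a is at least as good as b on every objective and strictly
    better on at least one (all objectives are maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def update_archive(archive, candidate):
    """Return a new archive that contains only non-dominated score vectors."""
    if any(kept == candidate or dominates(kept, candidate) for kept in archive):
        return list(archive)  # candidate adds nothing new
    survivors = [kept for kept in archive if not dominates(candidate, kept)]
    return survivors + [candidate]


# Hypothetical (success rate, negative energy use, safety) scores per policy.
candidates = [
    (0.6, -3.0, 0.9),
    (0.7, -2.5, 0.9),  # dominates the first entry
    (0.5, -1.0, 0.8),  # worse success, but much lower energy: kept
    (0.7, -2.5, 0.9),  # duplicate: ignored
]
archive = []
for scores in candidates:
    archive = update_archive(archive, scores)
```

Keeping the trade-off as a set of vectors is what lets a method avoid collapsing the objectives into one hand-weighted reward.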

25 pages, 5836 KB  
Article
MRSliceNet: Multi-Scale Recursive Slice and Context Fusion Network for Instance Segmentation of Leaves from Plant Point Clouds
by Shan Liu, Guangshuai Wang, Hongbin Fang, Min Huang, Tengping Jiang and Yongjun Wang
Plants 2025, 14(21), 3349; https://doi.org/10.3390/plants14213349 - 31 Oct 2025
Viewed by 577
Abstract
Plant phenotyping plays a vital role in connecting genotype to environmental adaptability, with important applications in crop breeding and precision agriculture. Traditional leaf measurement methods are laborious and destructive, while modern 3D sensing technologies like LiDAR provide high-resolution point clouds but face challenges in automatic leaf segmentation due to occlusion, geometric similarity, and uneven point density. To address these challenges, we propose MRSliceNet, an end-to-end deep learning framework inspired by human visual cognition. The network integrates three key components: a Multi-scale Recursive Slicing Module (MRSM) for detailed local feature extraction, a Context Fusion Module (CFM) that combines local and global features through attention mechanisms, and an Instance-Aware Clustering Head (IACH) that generates discriminative embeddings for precise instance separation. Extensive experiments on two challenging datasets show that our method establishes new state-of-the-art performance, achieving AP of 55.04%/53.78%, AP50 of 65.37%/64.00%, and AP25 of 74.68%/73.45% on Dataset A and Dataset B, respectively. The proposed framework not only produces clear boundaries and reliable instance identification but also provides an effective solution for automated plant phenotyping, as evidenced by its successful implementation in real-world agricultural research pipelines. Full article

17 pages, 696 KB  
Review
Regulatory Role of Zinc in Acute Promyelocytic Leukemia: Cellular and Molecular Aspects with Therapeutic Implications
by Norihiro Ikegami, István Szegedi, Csongor Kiss and Miklós Petrás
Int. J. Mol. Sci. 2025, 26(19), 9685; https://doi.org/10.3390/ijms26199685 - 4 Oct 2025
Viewed by 1055
Abstract
Acute promyelocytic leukemia (APL) is a rare subtype of acute myeloid leukemia (AML) characterized by chromosomal translocation forming the fusion protein that blocks the differentiation of myeloid progenitors and increases the self-renewal of leukemia cells. The introduction of all-trans retinoic acid (ATRA) and arsenic trioxide (ATO) has dramatically improved outcomes in APL, making it a leading example of successful treatment through differentiation of cancer cells. However, life-threatening side effects and treatment resistance may develop; therefore, modulation of the safety and efficacy of these drugs may contribute to further improving treatment results. Recently, zinc, involved in the structure and function of transcription factors, has received special attention for its potential role in the development and treatment response of cancer. Zinc homeostasis is disrupted in APL, with intracellular accumulation stabilizing oncogenic proteins. Zinc depletion promotes degradation of PML–RARA and induces apoptosis, while supplementation enhances genotoxic stress in leukemic cells but protects normal hematopoiesis. Zinc also regulates key transcription factors involved in differentiation and proliferation, including RUNX2, KLF4, GFI1, and CREB. In this review, we examine how zinc may impact zinc-finger (ZnF) and non-ZnF transcription factors and differentiation therapy in APL, thereby identifying potential strategies to enhance treatment efficacy and minimize side effects. Full article
(This article belongs to the Special Issue Molecular Mechanism of Acute Myeloid Leukemia)

26 pages, 11731 KB  
Article
Sow Estrus Detection Based on the Fusion of Vulvar Visual Features
by Jianyu Fang, Lu Yang, Xiangfang Tang, Shuqing Han, Guodong Cheng, Yali Wang, Liwen Chen, Baokai Zhao and Jianzhai Wu
Animals 2025, 15(18), 2709; https://doi.org/10.3390/ani15182709 - 16 Sep 2025
Viewed by 936
Abstract
Under large-scale farming conditions, automated sow estrus detection is crucial for improving reproductive efficiency, optimizing breeding management, and reducing labor costs. Conventional estrus detection relies heavily on human expertise, a practice that introduces subjective variability and consequently diminishes both accuracy and efficiency. Failure to identify estrus promptly and pair animals effectively lowers breeding success rates and drives up overall husbandry costs. In response to the need for the automated detection of sows’ estrus states in large-scale pig farms, this study proposes a method for detecting sows’ vulvar status and estrus based on multi-dimensional feature crossing. The method adopts a dual optimization strategy: First, the Bi-directional Feature Pyramid Network-Selective Decoding Integration (BiFPN-SDI) module performs the bidirectional, weighted fusion of the backbone’s low-level texture and high-level semantics, retaining the multi-dimensional cues most relevant to vulvar morphology and producing a scale-aligned, minimally redundant feature map. Second, by embedding a Spatially Enhanced Attention Module head (SEAM-Head) channel attention mechanism into the detection head, the model further amplifies key hyperemia-related signals, while suppressing background noise, thereby enabling cooperative and more precise bounding box localization. To adapt the model for edge computing environments, Masked Generative Distillation (MGD) knowledge distillation is introduced to compress the model while maintaining the detection speed and accuracy. Based on the bounding box of the vulvar region, the aspect ratio of the target area and the red saturation features derived from a dual-threshold method in the HSV color space are used to construct a lightweight Multilayer Perceptron (MLP) classification model for estrus state determination.
The network was trained on 1400 annotated samples, which were divided into training, testing, and validation sets in an 8:1:1 ratio. On-farm evaluations in commercial pig facilities show that the proposed system attains an 85% estrus detection success rate. Following lightweight optimization, inference latency fell from 24.29 ms to 18.87 ms, and the model footprint was compressed from 32.38 MB to 3.96 MB on the same machine, while maintaining a mean Average Precision (mAP) of 0.941; the accuracy penalty from model compression was kept below 1%. Moreover, the model demonstrates robust performance under complex lighting and occlusion conditions, enabling real-time processing from vulvar localization to estrus detection, and providing an efficient and reliable technical solution for automated estrus monitoring in large-scale pig farms. Full article
(This article belongs to the Special Issue Application of Precision Farming in Pig Systems)
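
The abstract names two hand-crafted inputs to its classifier: the bounding-box aspect ratio and red saturation features from a dual-threshold rule in HSV space. The sketch below is one plausible reading, not the authors' procedure. It assumes the two thresholds are the two hue bands that red occupies at either end of the hue circle, and all cut-off values, function names, and sample pixels are hypothetical.

```python
import colorsys


def aspect_ratio(box):
    """Width / height of an (x_min, y_min, x_max, y_max) bounding box."""
    x_min, y_min, x_max, y_max = box
    return (x_max - x_min) / (y_max - y_min)


def red_features(pixels, low_max=10 / 360, high_min=340 / 360,
                 min_sat=0.4, min_val=0.2):
    """Return (fraction of red pixels, mean saturation of those pixels).

    pixels is an iterable of (r, g, b) tuples in 0..255. Red wraps around the
    hue axis, so a pixel counts as red if its hue falls in either the low band
    [0, low_max] or the high band [high_min, 1]. All thresholds are assumed
    values chosen for illustration.
    """
    total = 0
    red_sats = []
    for r, g, b in pixels:
        total += 1
        h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        in_red_band = h <= low_max or h >= high_min
        if in_red_band and s >= min_sat and v >= min_val:
            red_sats.append(s)
    if total == 0:
        return 0.0, 0.0
    mean_sat = sum(red_sats) / len(red_sats) if red_sats else 0.0
    return len(red_sats) / total, mean_sat


# Hypothetical crop: pure red, a pinkish red, green, and gray.
sample = [(255, 0, 0), (200, 40, 60), (0, 255, 0), (128, 128, 128)]
frac, mean_sat = red_features(sample)
ratio = aspect_ratio((10, 20, 110, 70))
```

A feature vector such as `[ratio, frac, mean_sat]` is the kind of low-dimensional input a small MLP could consume, which fits the lightweight classifier the abstract describes.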

16 pages, 4344 KB  
Article
Recombinant Production of a TRAF-Domain Lectin from Cauliflower: A Soluble Expression Strategy for Functional Protein Recovery in E. coli
by Ana Káren de Mendonça Ludgero, Ana Luísa Aparecida da Silva, Luiz Henrique Cruz, Camila Aparecida Coelho Brazão, Kelly Maria Hurley Taylor, Leandro Licursi de Oliveira, Caio Roberto Soares Bragança and Christiane Eliza Motta Duarte
Int. J. Mol. Sci. 2025, 26(17), 8287; https://doi.org/10.3390/ijms26178287 - 26 Aug 2025
Cited by 1 | Viewed by 1311
Abstract
Lectins are glycan-binding proteins involved in diverse biological processes and have gained attention for their potential applications in biotechnology and immunomodulation. BOL (Brassica oleracea lectin) is a unique ~34 kDa lectin isolated from Brassica oleracea var. botrytis, composed exclusively of TRAF-like domains, where TRAF stands for tumor necrosis factor receptor–associated factor. To overcome the limitations of plant-based extraction, we aimed to produce recombinant BOL in Escherichia coli. Various strains and expression vectors were tested under distinct induction conditions to optimize solubility and yield. While expression using pET28a was unsuccessful, GST-tagged BOL was efficiently expressed in E. coli BL21-R3-pRARE2(DE3) and purified using affinity chromatography. Functional assays demonstrated that the recombinant protein retained lectin activity, as evidenced by hemagglutination of goat erythrocytes. Protein identity was confirmed by MALDI-TOF/TOF mass spectrometry, with tryptic peptides matching the BOL lectin sequence in the National Center for Biotechnology Information (NCBI) database. Our findings highlight the importance of codon optimization, temperature modulation, and fusion tag selection for the successful expression of eukaryotic lectins in E. coli. This work provides a platform for future functional studies of BOL and supports its potential application in plant immunity and biomedical research. Full article
(This article belongs to the Special Issue Glycoconjugates: From Structure to Therapeutic Application)

18 pages, 4865 KB  
Article
A Multi-Scale Cross-Layer Fusion Method for Robotic Grasping Detection
by Chengxuan Huang, Jing Xu, Xinyu Cai and Shiying Shen
Technologies 2025, 13(8), 357; https://doi.org/10.3390/technologies13080357 - 13 Aug 2025
Viewed by 1014
Abstract
Measurement of grasp configurations (position, orientation, and width) in unstructured environments is critical for robotic systems. Accurate and robust prediction relies on rich multi-scale object representations; however, detail loss and fusion conflicts in multi-scale processing often cause measurement errors, particularly for complex objects. This study proposes a multi-scale and cross-layer fusion grasp detection network (MCFG-Net) based on a skip-connected encoder–decoder architecture. The sampling module in the encoder–decoder is optimized, and the multi-scale fusion strategy is improved, enabling pixel-level grasp rectangles to be generated in real time. A multi-scale spatial feature enhancement module (MSFEM) addresses spatial detail loss in traditional feature pyramids and preserves spatial consistency by capturing contextual information within the same scale. In addition, a cascaded fusion attention module (CFAM) is designed to assist skip connections and mitigate redundant information and semantic mismatch during feature fusion. Experimental results show that MCFG-Net achieves grasp detection accuracies of 99.62% ± 0.11% on the Cornell dataset and 94.46% ± 0.22% on the Jacquard dataset. Real-world tests on an AUBO i5 robot yield success rates of 98.5% for single-target and 95% for multi-target grasping tasks, demonstrating practical applicability in unstructured environments. Full article
(This article belongs to the Special Issue AI Robotics Technologies and Their Applications)

21 pages, 7677 KB  
Article
Hyperspectral Imaging Combined with a Dual-Channel Feature Fusion Model for Hierarchical Detection of Rice Blast
by Yuan Qi, Tan Liu, Songlin Guo, Peiyan Wu, Jun Ma, Qingyun Yuan, Weixiang Yao and Tongyu Xu
Agriculture 2025, 15(15), 1673; https://doi.org/10.3390/agriculture15151673 - 2 Aug 2025
Cited by 1 | Viewed by 1346
Abstract
Rice blast caused by Magnaporthe oryzae is a major cause of yield reductions and quality deterioration in rice. Therefore, early detection of the disease is necessary for controlling the spread of rice blast. This study proposed a dual-channel feature fusion model (DCFM) to achieve effective identification of rice blast. The DCFM model extracted spectral features using successive projection algorithm (SPA), random frog (RFrog), and competitive adaptive reweighted sampling (CARS), and extracted spatial features from spectral images using MobileNetV2 combined with the convolutional block attention module (CBAM). Then, these features were fused using the feature fusion adaptive conditioning module in DCFM and input into the fully connected layer for disease identification. The results show that the model combining spectral and spatial features was superior to the classification models based on single features for rice blast detection, with OA and Kappa higher than 90% and 88%, respectively. The DCFM model based on SPA screening obtained the best results, with an OA of 96.72% and a Kappa of 95.97%. Overall, this study enables the early and accurate identification of rice blast, providing a rapid and reliable method for rice disease monitoring and management. It also offers a valuable reference for the detection of other crop diseases. Full article
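
The two metrics reported in this abstract, overall accuracy (OA) and Cohen's Kappa, are both derived from a confusion matrix. The sketch below shows the standard definitions; the 2×2 matrix is made-up illustration data, not the paper's results, and the function name is hypothetical.

```python
def overall_accuracy_and_kappa(cm):
    """Compute OA and Cohen's Kappa from a square confusion matrix.

    cm[i][j] is the number of samples of true class i predicted as class j.
    """
    k = len(cm)
    n = sum(sum(row) for row in cm)
    observed = sum(cm[i][i] for i in range(k)) / n  # OA, observed agreement
    # Agreement expected by chance, from row and column marginals.
    expected = sum(
        sum(cm[i]) * sum(row[i] for row in cm) for i in range(k)
    ) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical two-class confusion matrix, for illustration only.
cm = [[45, 5],
      [10, 40]]
oa, kappa = overall_accuracy_and_kappa(cm)
```

Kappa discounts the agreement a classifier would reach by chance, which is why it comes out lower than OA on the same predictions.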

22 pages, 6487 KB  
Article
An RGB-D Vision-Guided Robotic Depalletizing System for Irregular Camshafts with Transformer-Based Instance Segmentation and Flexible Magnetic Gripper
by Runxi Wu and Ping Yang
Actuators 2025, 14(8), 370; https://doi.org/10.3390/act14080370 - 24 Jul 2025
Viewed by 1219
Abstract
Accurate segmentation of densely stacked and weakly textured objects remains a core challenge in robotic depalletizing for industrial applications. To address this, we propose MaskNet, an instance segmentation network tailored for RGB-D input, designed to enhance recognition performance under occlusion and low-texture conditions. Built upon a Vision Transformer backbone, MaskNet adopts a dual-branch architecture for RGB and depth modalities and integrates multi-modal features using an attention-based fusion module. Further, spatial and channel attention mechanisms are employed to refine feature representation and improve instance-level discrimination. The segmentation outputs are used in conjunction with regional depth to optimize the grasping sequence. Experimental evaluations on camshaft depalletizing tasks demonstrate that MaskNet achieves a precision of 0.980, a recall of 0.971, and an F1-score of 0.975, outperforming a YOLO11-based baseline. In an actual scenario, with a self-designed flexible magnetic gripper, the system maintains a maximum grasping error of 9.85 mm and a 98% task success rate across multiple camshaft types. These results validate the effectiveness of MaskNet in enabling fine-grained perception for robotic manipulation in cluttered, real-world scenarios. Full article
(This article belongs to the Section Actuators for Robotics)
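
As a quick consistency check on the figures quoted in this abstract, the F1-score is the harmonic mean of precision and recall. The helper name below is hypothetical; the two inputs are the values stated in the abstract.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract.
f1 = f1_score(0.980, 0.971)
f1_rounded = round(f1, 3)
```

The rounded value agrees with the reported F1-score of 0.975.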

19 pages, 4801 KB  
Article
Attention-Enhanced CNN-LSTM Model for Exercise Oxygen Consumption Prediction with Multi-Source Temporal Features
by Zhen Wang, Yingzhe Song, Lei Pang, Shanjun Li and Gang Sun
Sensors 2025, 25(13), 4062; https://doi.org/10.3390/s25134062 - 29 Jun 2025
Cited by 4 | Viewed by 1250
Abstract
Dynamic oxygen uptake (VO₂) reflects moment-to-moment changes in oxygen consumption during exercise and underpins training design, performance enhancement, and clinical decision-making. We tackled two key obstacles (the limited fusion of heterogeneous sensor data and inadequate modeling of long-range temporal patterns) by integrating wearable accelerometer and heart-rate streams with a convolutional neural network–LSTM (CNN-LSTM) architecture and optional attention modules. Physiological signals and VO₂ were recorded from 21 adults through resting assessment and cardiopulmonary exercise testing. The results showed that pairing accelerometer with heart-rate inputs improves prediction compared with considering the heart rate alone. The baseline CNN-LSTM reached R² = 0.946, outperforming a plain LSTM (R² = 0.926) thanks to stronger local spatio-temporal feature extraction. Introducing a spatial attention mechanism raised accuracy further (R² = 0.962), whereas temporal attention reduced it (R² = 0.930), indicating that attention success depends on how well the attended features align with exercise dynamics. Stacking both attentions (spatio-temporal) yielded R² = 0.960, slightly below the value for spatial attention alone, implying that added complexity does not guarantee better performance. Across all models, prediction errors grew during high-intensity bouts, highlighting a bottleneck in capturing non-linear physiological responses under heavy load. These findings inform architecture selection for wearable metabolic monitoring and clarify when attention mechanisms add value. Full article
(This article belongs to the Special Issue Sensors for Physiological Monitoring and Digital Health)

24 pages, 25315 KB  
Article
PAMFPN: Position-Aware Multi-Kernel Feature Pyramid Network with Adaptive Sparse Attention for Robust Object Detection in Remote Sensing Imagery
by Xiaofei Yang, Suihua Xue, Lin Li, Sihuan Li, Yudong Fang, Xiaofeng Zhang and Xiaohui Huang
Remote Sens. 2025, 17(13), 2213; https://doi.org/10.3390/rs17132213 - 27 Jun 2025
Cited by 1 | Viewed by 1515
Abstract
Deep learning methods have achieved remarkable success in remote sensing object detection. Existing object detection methods focus on integrating convolutional neural networks (CNNs) and Transformer networks to explore local and global representations to improve performance. However, existing methods relying on fixed convolutional kernels and dense global attention mechanisms suffer from computational redundancy and insufficient discriminative feature extraction, particularly for small and rotation-sensitive targets. To address these limitations, we propose a Dynamic Multi-Kernel Position-Aware Feature Pyramid Network (PAMFPN), which integrates adaptive sparse position modeling and multi-kernel dynamic fusion to achieve robust feature representation. Firstly, we design a position-interactive context module (PICM) that incorporates distance-aware sparse attention and dynamic positional encoding. It selectively focuses computation on sparse targets through a decay function that suppresses background noise while enhancing spatial correlations of critical regions. Secondly, we design a dual-kernel adaptive fusion (DKAF) architecture by combining region-sensitive attention (RSA) and reconfigurable context aggregation (RCA). RSA employs orthogonal large-kernel convolutions to capture anisotropic spatial features for arbitrarily oriented targets, while RCA dynamically adjusts the kernel scales based on content complexity, effectively addressing scale variations and intraclass diversity. Extensive experiments on three benchmark datasets (DOTA-v1.0, SSDD, NWPU VHR-10) demonstrate the effectiveness and versatility of the proposed PAMFPN. This work bridges the gap between efficient computation and robust feature fusion in remote sensing detection, offering a universal solution for real-world applications. Full article
(This article belongs to the Special Issue AI-Driven Hyperspectral Remote Sensing of Atmosphere and Land)

18 pages, 3721 KB  
Article
Haptic–Vision Fusion for Accurate Position Identification in Robotic Multiple Peg-in-Hole Assembly
by Jinlong Chen, Deming Luo, Zhigang Xiao, Minghao Yang, Xingguo Qin and Yongsong Zhan
Electronics 2025, 14(11), 2163; https://doi.org/10.3390/electronics14112163 - 26 May 2025
Viewed by 1665
Abstract
Multi-peg-hole assembly is a fundamental process in robotic manufacturing, particularly for circular aviation electrical connectors (CAECs) that require precise axial alignment. However, CAEC assembly poses significant challenges due to small apertures, posture disturbances, and the need for high error tolerance. This paper proposes a dual-stream Siamese network (DSSN) framework that fuses visual and tactile modalities to achieve accurate position identification in six-degree-of-freedom robotic connector assembly tasks. The DSSN employs ConvNeXt for visual feature extraction and SE-ResNet-50 with integrated attention mechanisms for tactile feature extraction, while a gated attention module adaptively fuses multimodal features. A bidirectional long short-term memory (Bi-LSTM) recurrent neural network is introduced to jointly model spatiotemporal deviations in position and orientation. Compared with state-of-the-art methods, the proposed DSSN achieves improvements of approximately 7.4%, 5.7%, and 5.4% in assembly success rates after 1, 5, and 10 buckling iterations, respectively. Experimental results validate that the integration of multimodal adaptive fusion and sequential spatiotemporal learning enables robust and precise robotic connector assembly under high-tolerance conditions. Full article

23 pages, 14757 KB  
Article
SwinNowcast: A Swin Transformer-Based Model for Radar-Based Precipitation Nowcasting
by Zhuang Li, Zhenyu Lu, Yizhe Li and Xuan Liu
Remote Sens. 2025, 17(9), 1550; https://doi.org/10.3390/rs17091550 - 27 Apr 2025
Cited by 4 | Viewed by 2992
Abstract
Precipitation nowcasting is pivotal in monitoring extreme weather events and issuing early warnings for meteorological disasters. However, the inherent complexity of precipitation systems, coupled with their nonlinear spatiotemporal evolution, poses significant challenges for traditional numerical weather prediction methods in capturing multi-scale details effectively. Existing deep learning models similarly struggle to simultaneously capture local multi-scale features and global long-term spatiotemporal dependencies. To tackle this challenge, we propose SwinNowcast, a deep learning model based on the Swin Transformer architecture. Through the novel design of a multi-scale feature balancing module (M-FBM), the model dynamically integrates local-scale features with global spatiotemporal dependencies. Specifically, the multi-scale convolutional block attention module (MSCBAM) captures local multi-scale features, while the gated attention feature fusion unit (GAFFU) adaptively regulates the fusion intensity, thereby enhancing spatial structure and temporal continuity in a synergistic manner. Experiments were performed on the precipitation dataset from the Royal Netherlands Meteorological Institute (KNMI) under thresholds of 0.5 mm, 5 mm, and 10 mm. The results indicate that SwinNowcast surpasses six state-of-the-art approaches regarding the critical success index (CSI) and the Heidke skill score (HSS), while markedly reducing the false alarm rate (FAR). The proposed model holds substantial practical value in applications such as short-term heavy rainfall monitoring and urban flood early warning, offering effective technological support for meteorological disaster mitigation. Full article
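
The scores named in this abstract (CSI, HSS, and FAR) all come from a 2×2 contingency table built by thresholding forecast and observed rainfall. The sketch below uses the standard definitions, with FAR taken as the false alarm ratio, false alarms / (hits + false alarms). The rainfall values and the function name are hypothetical, and only the 0.5 mm threshold is taken from the abstract.

```python
def nowcast_scores(pred, obs, threshold):
    """Return (CSI, FAR, HSS) for paired forecast/observed values.

    An event occurs where the value is >= threshold. The sketch assumes the
    denominators are non-zero, i.e. some events are forecast and observed.
    """
    hits = misses = false_alarms = correct_negatives = 0
    for p, o in zip(pred, obs):
        p_event, o_event = p >= threshold, o >= threshold
        if p_event and o_event:
            hits += 1
        elif o_event:
            misses += 1
        elif p_event:
            false_alarms += 1
        else:
            correct_negatives += 1
    csi = hits / (hits + misses + false_alarms)
    far = false_alarms / (hits + false_alarms)
    hss = 2 * (hits * correct_negatives - misses * false_alarms) / (
        (hits + misses) * (misses + correct_negatives)
        + (hits + false_alarms) * (false_alarms + correct_negatives)
    )
    return csi, far, hss


# Hypothetical rainfall values in mm for six grid cells.
pred = [0.0, 0.6, 5.2, 0.1, 7.0, 0.0]
obs = [0.0, 0.9, 4.8, 0.8, 0.2, 0.0]
csi, far, hss = nowcast_scores(pred, obs, threshold=0.5)
```

CSI ignores correct negatives, so it is not inflated by large dry areas, while HSS measures skill relative to chance agreement.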

17 pages, 4721 KB  
Article
Deep Learning Model for Precipitation Nowcasting Based on Residual and Attention Mechanisms
by Zhan Zhang, Qingping Song, Minzheng Duan, Hailei Liu, Juan Huo and Congzheng Han
Remote Sens. 2025, 17(7), 1123; https://doi.org/10.3390/rs17071123 - 21 Mar 2025
Cited by 6 | Viewed by 4721
Abstract
Nowcasting is a critical technology for disaster prevention and mitigation, and the accuracy of radar echo extrapolation directly impacts forecasting performance. In most deep learning-based models, accurately predicting heavy precipitation remains a challenging task. Focusing on the region of China, this study proposes an improved model based on residual and attention mechanisms—RA-UNet—for precipitation nowcasting with a lead time of 3 h. The model introduces the residual neural network (ResNet) and the convolutional block attention module (CBAM) to integrate multi-scale features into the U-Net encoder–decoder architecture, enhancing its ability to capture the spatiotemporal evolution of precipitation systems. Meanwhile, depthwise separable convolutions are employed to replace conventional convolutions, significantly improving computational efficiency while preserving model performance. To evaluate the model’s performance, experiments were conducted using 6 min resolution radar echo data from China in 2024, with comparisons made against the optical flow (OF) method and the U-Net model. The experimental results show that RA-UNet demonstrates significant advantages in 3 h forecasting: its mean absolute error (MAE) is reduced by approximately 7%, the false alarm rate (FAR) decreases by about 20%, and it outperforms the comparison models in metrics such as the critical success index (CSI) and structural similarity index (SSIM). Notably, RA-UNet effectively mitigates intensity degradation in long-term forecasts, successfully predicting the trend of >40 dBZ strong echo cores in two typical cases and significantly improving the premature dissipation problem of precipitation fields. This study provides a new approach to refined forecasting of complex precipitation systems, and future work will combine multi-source data fusion with physical constraint mechanisms to further enhance precipitation event prediction capabilities. Full article
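
The efficiency claim for depthwise separable convolutions in this abstract can be made concrete by counting weights. The sketch below uses the standard parameter-count arithmetic and ignores biases. The layer sizes are hypothetical and are not taken from RA-UNet.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard kernel x kernel convolution (no bias)."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel, c_in, c_out):
    """Weights in a depthwise (kernel x kernel per input channel) convolution
    followed by a pointwise (1 x 1) convolution (no bias)."""
    return kernel * kernel * c_in + c_in * c_out


# Hypothetical layer: 3 x 3 kernel, 64 input channels, 128 output channels.
standard = standard_conv_params(3, 64, 128)
separable = depthwise_separable_params(3, 64, 128)
ratio = separable / standard  # equals 1 / c_out + 1 / kernel**2
```

For this layer the separable form needs roughly 12% of the weights, which is where the computational saving comes from.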