Search Results (24)

Search Parameters:
Keywords = symmetric dual-attention mechanism

24 pages, 883 KB  
Article
SDA-Net: A Symmetric Dual-Attention Network with Multi-Scale Convolution for MOOC Dropout Prediction
by Yiwen Yang, Chengjun Xu and Guisheng Tian
Symmetry 2026, 18(1), 202; https://doi.org/10.3390/sym18010202 - 21 Jan 2026
Abstract
With the rapid development of Massive Open Online Courses (MOOCs), high dropout rates have become a major challenge, limiting the quality of online education and the effectiveness of targeted interventions. Although existing MOOC dropout prediction methods have incorporated deep learning and attention mechanisms to improve predictive performance to some extent, they still face limitations in modeling differences in course difficulty and learning engagement, capturing multi-scale temporal learning behaviors, and controlling model complexity. To address these issues, this paper proposes a MOOC dropout prediction model that integrates multi-scale convolution with a symmetric dual-attention mechanism, termed SDA-Net. In the feature modeling stage, the model constructs a time allocation ratio matrix (MRatio), a resource utilization ratio matrix (SRatio), and a relative group-level ranking matrix (Rank) to characterize learners’ behavioral differences in terms of time investment, resource usage structure, and relative performance, thereby mitigating the impact of course difficulty and individual effort disparities on prediction outcomes. Structurally, SDA-Net extracts learning behavior features at different temporal scales through multi-scale convolution and incorporates a symmetric dual-attention mechanism composed of spatial and channel attention to adaptively focus on information highly correlated with dropout risk, enhancing feature representation while maintaining a relatively lightweight architecture. 
Experimental results on the KDD Cup 2015 and XuetangX public datasets demonstrate that SDA-Net achieves more competitive performance than traditional machine learning methods, mainstream deep learning models, and attention-based approaches on major evaluation metrics; in particular, it attains an accuracy of 93.7% on the KDD Cup 2015 dataset and achieves an absolute improvement of 0.2 percentage points in Accuracy and 0.4 percentage points in F1-Score on the XuetangX dataset, confirming that the proposed model effectively balances predictive performance and model complexity. Full article
(This article belongs to the Section Computer)
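
The symmetric dual-attention idea described in this abstract (a channel branch and a spatial branch that reweight the same feature map) can be sketched roughly as follows. This is a parameter-free toy, not the authors' architecture: the average pooling, the sigmoid gating, the multiplicative combination, and the name `dual_attention` are all assumptions made for illustration.

```python
import math


def sigmoid(v: float) -> float:
    return 1.0 / (1.0 + math.exp(-v))


def dual_attention(x):
    """Reweight a feature map x given as [channels][time_steps].

    Hypothetical sketch: a real model would learn the gating weights.
    """
    n_ch, n_t = len(x), len(x[0])
    # Channel branch: one weight per channel, from its mean over time.
    ch_w = [sigmoid(sum(row) / n_t) for row in x]
    # Spatial branch: one weight per time step, from its mean over channels.
    sp_w = [sigmoid(sum(x[c][t] for c in range(n_ch)) / n_ch) for t in range(n_t)]
    # Both branches see the same input; their weights are combined by product.
    return [[x[c][t] * ch_w[c] * sp_w[t] for t in range(n_t)] for c in range(n_ch)]
```

The two branches are symmetric in the sense that each pools over the axis the other one keeps; the output has the same shape as the input.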
19 pages, 5302 KB  
Article
LSSCC-Net: Integrating Spatial-Feature Aggregation and Adaptive Attention for Large-Scale Point Cloud Semantic Segmentation
by Wenbo Wang, Xianghong Hua, Cheng Li, Pengju Tian, Yapeng Wang and Lechao Liu
Symmetry 2026, 18(1), 124; https://doi.org/10.3390/sym18010124 - 8 Jan 2026
Viewed by 217
Abstract
Point cloud semantic segmentation is a key technology for applications such as autonomous driving, robotics, and virtual reality. Current approaches are heavily reliant on local relative coordinates and simplistic attention mechanisms to aggregate neighborhood information. This often leads to an ineffective joint representation of geometric perturbations and feature variations, coupled with a lack of adaptive selection for salient features during context fusion. On this basis, we propose LSSCC-Net, a novel segmentation framework based on LACV-Net. First, the spatial-feature dynamic aggregation module is designed to fuse offset information by symmetric interaction between spatial positions and feature channels, thus supplementing local structural information. Second, a dual-dimensional attention mechanism (spatial and channel) is introduced to symmetrically deploy attention modules in both the encoder and decoder, prioritizing salient information extraction. Finally, Lovász-Softmax Loss is used as an auxiliary loss to optimize the training objective. The proposed method is evaluated on two public benchmark datasets. The mIoU on the Toronto3D and S3DIS datasets is 83.6% and 65.2%, respectively. Compared with the baseline LACV-Net, LSSCC-Net showed notable improvements in challenging categories: the IoU for “road mark” and “fence” on Toronto3D increased by 3.6% and 8.1%, respectively. These results indicate that LSSCC-Net more accurately characterizes complex boundaries and fine-grained structures, enhancing segmentation capabilities for small-scale targets and category boundaries. Full article
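
For readers unfamiliar with the mIoU metric reported above, a minimal sketch of how per-class IoU and its mean are usually computed from predicted and ground-truth labels is given below. The function names and the choice to skip classes absent from both prediction and ground truth are assumptions; the paper's exact evaluation protocol is not stated in the abstract.

```python
def per_class_iou(pred, target, num_classes):
    """Return IoU per class, or None for classes absent from both inputs."""
    ious = []
    for k in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == k and t == k)
        union = sum(1 for p, t in zip(pred, target) if p == k or t == k)
        ious.append(inter / union if union else None)
    return ious


def mean_iou(pred, target, num_classes):
    """Average IoU over the classes that actually occur."""
    valid = [v for v in per_class_iou(pred, target, num_classes) if v is not None]
    return sum(valid) / len(valid) if valid else 0.0
```

With `pred = [0, 0, 1, 1]` and `target = [0, 1, 1, 1]`, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12.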

17 pages, 779 KB  
Article
Geometry Diagram Parsing and Reasoning Based on Deep Semantic Fusion
by Pengpeng Jian, Xuhui Zhang, Lei Wu, Bin Ma and Wangyang Hong
Symmetry 2026, 18(1), 92; https://doi.org/10.3390/sym18010092 - 4 Jan 2026
Viewed by 314
Abstract
Effective Automated Geometric Problem Solving (AGP) requires a deep integration of visual perception and textual comprehension. To address this, we propose a dual-stream fusion model that injects deep semantic understanding from a Pre-trained Language Model (PLM) into the geometric diagram parsing pipeline. Our core innovation is a Semantic-Guided Cross-Attention (SGCA) mechanism, which uses the global semantic intent of the problem text to direct attention toward key visual primitives. This yields context-enriched visual representations that serve as inputs to a Graph Neural Network (GNN), enabling relational reasoning that is not only perception-driven but also context-aware. By explicitly bridging the semantic gap between text and diagrams, our approach delivers more robust and accurate predictions. To the best of our knowledge, this is the first study to introduce a semantic-guided cross-attention mechanism into geometric diagram parsing, establishing a new paradigm that effectively addresses the cross-modal semantic gap and achieves state-of-the-art performance. This is particularly effective for parsing problems involving geometric symmetries, where textual cues often clarify or define symmetrical relationships not obvious from the diagram alone. Full article
(This article belongs to the Special Issue Symmetry and Asymmetry in Human-Computer Interaction)

15 pages, 1974 KB  
Article
A Dual-Path Fusion Network with Edge Feature Enhancement for Medical Image Segmentation
by Liangxu Shi, Weiyuan He and Guodong Wang
Mathematics 2026, 14(1), 55; https://doi.org/10.3390/math14010055 - 24 Dec 2025
Viewed by 357
Abstract
This paper proposes a Dual-path Feature-enhanced Fusion Network (DPF-Net) for medical image segmentation to address limitations in existing methods, including insufficient edge feature extraction, semantic gaps among multi-scale encoder features, and significant semantic disparities between the encoder and decoder in U-Net architectures. To this end, we design a symmetric encoder–decoder structure based on U-Net and introduce three core modules: the Edge Feature Gating (EFG) module, which extracts and integrates edge features from shallow encoder layers to enhance edge integrity; the Cross-channel Fusion Transformer (CCFT) module, embedded in skip connections, to achieve comprehensive and semantically balanced multi-scale cross-fused features; and the Dual-path Feature Fusion (DPFM) module, which combines channel and spatial attention mechanisms to effectively bridge the semantic gap between encoder and decoder features while improving spatial resolution recovery accuracy. Experimental results demonstrate that DPF-Net achieves superior performance on six public datasets covering gland segmentation, colon polyp segmentation, and skin lesion segmentation tasks, significantly outperforming existing methods in terms of both mDice and IoU metrics. The conclusions confirm that the proposed method not only comprehensively improves the overall accuracy and edge segmentation quality of medical image segmentation but also enhances the model’s generalization capability across different tasks. Full article

26 pages, 5797 KB  
Article
ASGT-Net: A Multi-Modal Semantic Segmentation Network with Symmetric Feature Fusion and Adaptive Sparse Gating
by Wendie Yue, Kai Chang, Xinyu Liu, Kaijun Tan and Wenqian Chen
Symmetry 2025, 17(12), 2070; https://doi.org/10.3390/sym17122070 - 3 Dec 2025
Viewed by 481
Abstract
In the field of remote sensing, accurate semantic segmentation is crucial for applications such as environmental monitoring and urban planning. Effective fusion of multi-modal data is a key factor in improving land cover classification accuracy. To address the limitations of existing methods, such as inadequate feature fusion, noise interference, and insufficient modeling of long-range dependencies, this paper proposes ASGT-Net, an enhanced multi-modal fusion network. The network adopts an encoder-decoder architecture, with the encoder featuring a symmetric dual-branch structure based on a ResNet50 backbone and a hierarchical feature extraction framework. At each layer, Adaptive Weighted Fusion (AWF) modules are introduced to dynamically adjust the feature contributions from different modalities. Additionally, this paper innovatively introduces an alternating mechanism of Learnable Sparse Attention (LSA) and Adaptive Gating Fusion (AGF): LSA selectively activates salient features to capture critical spatial contextual information, while AGF adaptively gates multi-modal data flows to suppress common conflicting noise. These mechanisms work synergistically to significantly enhance feature integration, improve multi-scale representation, and reduce computational redundancy. Experiments on the ISPRS benchmark datasets (Vaihingen and Potsdam) demonstrate that ASGT-Net outperforms current mainstream multi-modal fusion techniques in both accuracy and efficiency. Full article
(This article belongs to the Section Computer)
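
One simple way to "dynamically adjust the feature contributions from different modalities" is a softmax over per-modality scores. The sketch below assumes two modalities and scalar scores that would be learned in practice; it is an illustration of weighted fusion in general, not the AWF module as published, and the name `adaptive_weighted_fusion` is hypothetical.

```python
import math


def adaptive_weighted_fusion(feat_a, feat_b, score_a, score_b):
    """Blend two same-length feature vectors with softmax weights.

    score_a and score_b stand in for learned modality scores (assumption).
    """
    m = max(score_a, score_b)  # subtract the max for numerical stability
    ea, eb = math.exp(score_a - m), math.exp(score_b - m)
    wa, wb = ea / (ea + eb), eb / (ea + eb)
    return [wa * a + wb * b for a, b in zip(feat_a, feat_b)]
```

Equal scores give a plain average; a much larger score for one modality lets its features dominate the fused output.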

24 pages, 9828 KB  
Article
A Novel Object Detection Algorithm Combined YOLOv11 with Dual-Encoder Feature Aggregation
by Haisong Chen, Pengfei Yuan, Wenbai Liu, Fuling Li and Aili Wang
Sensors 2025, 25(23), 7270; https://doi.org/10.3390/s25237270 - 28 Nov 2025
Cited by 1 | Viewed by 739
Abstract
To address the limitations of unimodal visual detection in complex scenarios involving low illumination, occlusion, and texture-sparse environments, this paper proposes an improved YOLOv11-based dual-branch RGB-D fusion framework. The symmetric architecture processes RGB images and depth maps in parallel, integrating a Dual-Encoder Cross-Attention (DECA) module for cross-modal feature weighting and a Dual-Encoder Feature Aggregation (DEPA) module for hierarchical fusion—where the RGB branch captures texture semantics while the depth branch extracts geometric priors. To comprehensively validate the effectiveness and generalization capability of the proposed framework, we designed a multi-stage evaluation strategy leveraging complementary benchmark datasets. On the M3FD dataset, the model was evaluated under both RGB-depth and RGB-infrared configurations to verify core fusion performance and extensibility to diverse modalities. Additionally, the VOC2007 dataset was augmented with pseudo-depth maps generated by Depth Anything, assessing adaptability under monocular input constraints. Experimental results demonstrate that our method achieves mAP50 scores of 82.59% on VOC2007 and 81.14% on M3FD in RGB-infrared mode, outperforming the baseline YOLOv11 by 5.06% and 9.15%, respectively. Notably, in the RGB-depth configuration on M3FD, the model attains a mAP50 of 77.37% with precision of 88.91%, highlighting its robustness in geometric-aware detection tasks. Ablation studies confirm the critical roles of the Dynamic Branch Enhancement (DBE) module in adaptive feature calibration and the Dual-Encoder Attention (DEA) mechanism in multi-scale fusion, significantly enhancing detection stability under challenging conditions. With only 2.47M parameters, the framework provides an efficient and scalable solution for high-precision spatial perception in autonomous driving and robotics applications. Full article
(This article belongs to the Section Sensing and Imaging)
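
The mAP50 figures quoted above rest on a box-overlap test: a detection counts as a match when its intersection over union (IoU) with a ground-truth box is at least 0.5. A minimal sketch of that test follows, assuming axis-aligned boxes given as `(x1, y1, x2, y2)`; the helper names are hypothetical.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))  # overlap width
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))  # overlap height
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match_at_50(pred_box, gt_box):
    """True when the prediction overlaps the ground truth with IoU >= 0.5."""
    return box_iou(pred_box, gt_box) >= 0.5
```

Two 2x2 boxes offset by one unit in each direction overlap in a 1x1 square, giving IoU 1/7, which would not count as a match at the 0.5 threshold.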

24 pages, 7681 KB  
Review
Research Progress on Molecularly Imprinted Polymer-Aptasensors for Food Safety Detection
by Jiuyi Wang, Xiaogang Lin, Jinyu Wu, Xiao Lv, Binji Dai, Ke Wang and Jayne Wu
Symmetry 2025, 17(11), 1933; https://doi.org/10.3390/sym17111933 - 11 Nov 2025
Viewed by 528
Abstract
The biological accumulation of microcontaminants and associated antibiotic resistance in food poses significant threats to both human and environmental health. It is therefore crucial to design and develop efficient identification and detection methods. Recently, molecularly imprinted polymers (MIPs) and aptamers (Apts), as novel hybrid recognition elements, have received widespread attention from researchers. Dual recognition-based sensors have demonstrated enhanced performance and desirable characteristics, including high sensitivity, strong binding affinity, a low detection limit, and excellent stability under harsh environmental conditions, and are expected to find application in food safety. This paper compares the characteristics of MIP and Apt, highlighting the significant advantages of molecularly imprinted polymer–aptamer (MIP-Apt) dual recognition in selectivity, sensitivity, and stability, which stem from their symmetric integration, akin to an extension of the ‘lock-and-key’ model. It then systematically discusses three synthetic strategies for MIP-Apt hybrid recognition systems and their applications for food safety detection, focusing on analyzing their detection strategies, sensing mechanisms, construction methodologies, performance evaluations, and potential application value. It also offers substantive perspectives on both the prevailing limitations and promising developmental pathways for MIP-Apt hybrid recognition-based sensing platforms. Full article
(This article belongs to the Special Issue Symmetry in Biosensors)

22 pages, 3487 KB  
Article
Research and Optimization of Ultra-Short-Term Photovoltaic Power Prediction Model Based on Symmetric Parallel TCN-TST-BiGRU Architecture
by Tengjie Wang, Zian Gong, Zhiyuan Wang, Yuxi Liu, Yahong Ma, Feng Wang and Jing Li
Symmetry 2025, 17(11), 1855; https://doi.org/10.3390/sym17111855 - 3 Nov 2025
Viewed by 463
Abstract
(1) Background: Ultra-short-term photovoltaic (PV) power prediction is crucial for optimizing grid scheduling and enhancing energy utilization efficiency. Existing prediction methods face challenges of missing data, noise interference, and insufficient accuracy. (2) Methods: This study proposes a single-step hybrid neural network model integrating Temporal Convolutional Network (TCN), Temporal Shift Transformer (TST), and Bidirectional Gated Recurrent Unit (BiGRU) to achieve high-precision 15-minute-ahead PV power prediction, with a design aligned with symmetry principles. Data preprocessing uses Variational Mode Decomposition (VMD) and random forest interpolation to suppress noise and repair missing values. A symmetric parallel dual-branch feature extraction module is built: TCN-TST extracts local dynamics and long-term dependencies, while BiGRU captures global features. This symmetric structure matches the intra-day periodic symmetry of PV power (e.g., symmetric irradiance patterns around noon) and avoids bias from single-branch models. Tensor concatenation and an adaptive attention mechanism realize feature fusion and dynamic weighted output. (3) Results: Experiments on real data from a Xinjiang PV power station, with hyperparameter optimization (BiGRU units, activation function, TCN kernels, TST parameters), show that the model outperforms comparative models in MAE and R²; e.g., the MAE is 26.53% and 18.41% lower than those of TCN and Transformer, respectively. (4) Conclusions: The proposed method achieves a balance between accuracy and computational efficiency. It provides references for PV station operation, system scheduling, and grid stability. Full article
(This article belongs to the Section Engineering and Materials)
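
The MAE and R² metrics named in this abstract, and the "X% lower" comparison between models, follow standard definitions that can be sketched as below. The function names are hypothetical, and any numbers passed to them in examples are invented rather than taken from the paper.

```python
def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def relative_reduction(baseline_error, model_error):
    """Percentage by which model_error is lower than baseline_error."""
    return 100.0 * (baseline_error - model_error) / baseline_error
```

For instance, a baseline MAE of 10.0 and a model MAE of 7.5 (made-up values) would be reported as a 25% reduction.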

19 pages, 7270 KB  
Article
A Fast Rotation Detection Network with Parallel Interleaved Convolutional Kernels
by Leilei Deng, Lifeng Sun and Hua Li
Symmetry 2025, 17(10), 1621; https://doi.org/10.3390/sym17101621 - 1 Oct 2025
Viewed by 499
Abstract
In recent years, convolutional neural network-based object detectors have achieved extensive applications in remote sensing (RS) image interpretation. While multi-scale feature modeling optimization remains a persistent research focus, existing methods frequently overlook the symmetrical balance between feature granularity and morphological diversity, particularly when handling high-aspect-ratio RS targets with anisotropic geometries. This oversight leads to suboptimal feature representations characterized by spatial sparsity and directional bias. To address this challenge, we propose the Parallel Interleaved Convolutional Kernel Network (PICK-Net), a rotation-aware detection framework that embodies symmetry principles through dual-path feature modulation and geometrically balanced operator design. The core innovation lies in the synergistic integration of cascaded dynamic sparse sampling and symmetrically decoupled feature modulation, enabling adaptive morphological modeling of RS targets. Specifically, the Parallel Interleaved Convolution (PIC) module establishes symmetric computation patterns through mirrored kernel arrangements, effectively reducing computational redundancy while preserving directional completeness through rotational symmetry-enhanced receptive field optimization. Complementing this, the Global Complementary Attention Mechanism (GCAM) introduces bidirectional symmetry in feature recalibration, decoupling channel-wise and spatial-wise adaptations through orthogonal attention pathways that maintain equilibrium in gradient propagation. Extensive experiments on RSOD and NWPU-VHR-10 datasets demonstrate our superior performance, achieving 92.2% and 84.90% mAP, respectively, outperforming state-of-the-art methods including EfficientNet and YOLOv8. With only 12.5 M parameters, the framework achieves symmetrical optimization of accuracy-efficiency trade-offs. 
Ablation studies confirm that the symmetric interaction between PIC and GCAM enhances detection performance by 2.75%, particularly excelling in scenarios requiring geometric symmetry preservation, such as dense target clusters and extreme scale variations. Cross-domain validation on agricultural pest datasets further verifies its rotational symmetry generalization capability, demonstrating 84.90% accuracy in fine-grained orientation-sensitive detection tasks. Full article
(This article belongs to the Section Computer)

24 pages, 2616 KB  
Article
Symmetric Affix–Context Co-Attention: A Dual-Gating Framework for Robust POS Tagging in Low-Resource MRLs
by Yuan Qi, Samat Ali and Alim Murat
Symmetry 2025, 17(9), 1561; https://doi.org/10.3390/sym17091561 - 18 Sep 2025
Viewed by 860
Abstract
Part-of-speech (POS) tagging in low-resource, morphologically rich languages (LRLs/MRLs) remains challenging due to extensive affixation, high out-of-vocabulary (OOV) rates, and pervasive polysemy. We propose MRL-POS, a unified Transformer-CRF framework that dynamically selects informative affix features and integrates them with deep contextual embeddings via a novel dual-gating co-attention mechanism. First, a Dynamic Affix Selector adaptively adjusts n-gram ranges and frequency thresholds based on word length to ensure high-precision affix segmentation. Second, the Affix–Context Co-Attention Module employs two gating functions that conditionally amplify contextual dimensions with affix cues and vice versa, enabling robust disambiguation of complex and ambiguous forms. Third, Layer-Wise Attention Pooling aggregates multi-layer XLM-RoBERTa representations, emphasizing those most relevant for morphological and syntactic tagging. Evaluations on Uyghur, Kyrgyz, and Uzbek show that MRL-POS achieves an average F1 of 84.10%, OOV accuracy of 84.24%, and Poly-F1 of 72.14%, outperforming strong baselines by up to 8 F1 points. By explicitly modeling the symmetry between morphological affix cues and sentence-level context through a dual-gating co-attention mechanism, MRL-POS achieves a balanced fusion that both preserves local structure and captures global dependencies. Interpretability analyses confirm that 89.1% of the selected affixes align with linguistic expectations. This symmetric design not only enhances robustness in low-resource and agglutinative settings but also offers a general paradigm for symmetry-aware sequence labeling tasks. Full article
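
The "two gating functions that conditionally amplify contextual dimensions with affix cues and vice versa" can be pictured with a toy cross-gating step. In this sketch each vector is gated by a sigmoid of the other and the results are summed; the real module presumably uses learned projections, so the gate form, the additive fusion, and the name `dual_gate` are assumptions.

```python
import math


def sigmoid(v: float) -> float:
    return 1.0 / (1.0 + math.exp(-v))


def dual_gate(context, affix):
    """Cross-gate two same-length vectors and fuse them by addition.

    Hypothetical sketch: gates here have no learned parameters.
    """
    gated_context = [c * sigmoid(a) for c, a in zip(context, affix)]  # affix gates context
    gated_affix = [a * sigmoid(c) for c, a in zip(context, affix)]    # context gates affix
    return [gc + ga for gc, ga in zip(gated_context, gated_affix)]
```

The construction is symmetric: swapping the two arguments leaves the fused output unchanged.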

22 pages, 1243 KB  
Article
ProCo-NET: Progressive Strip Convolution and Frequency-Optimized Framework for Scale-Gradient-Aware Semantic Segmentation in Off-Road Scenes
by Zihang Liu, Donglin Jing and Chenxiang Ji
Symmetry 2025, 17(9), 1428; https://doi.org/10.3390/sym17091428 - 2 Sep 2025
Viewed by 763
Abstract
In off-road scenes, segmentation targets exhibit significant scale progression due to perspective depth effects from oblique viewing angles, meaning that the size of the same target undergoes continuous, boundary-less progressive changes along a specific direction. This asymmetric variation disrupts the geometric symmetry of targets, causing traditional segmentation networks to face three key challenges: (1) inefficient capture of continuous-scale features, where pyramid structures and multi-scale kernels struggle to balance computational efficiency with sufficient coverage of progressive scales; (2) degraded intra-class feature consistency, where local scale differences within targets induce semantic ambiguity; and (3) loss of high-frequency boundary information, where feature sampling operations exacerbate the blurring of progressive boundaries. To address these issues, this paper proposes the ProCo-NET framework for systematic optimization. Firstly, a Progressive Strip Convolution Group (PSCG) is designed to construct multi-level receptive field expansion through orthogonally oriented strip convolution cascading (employing symmetric processing in horizontal/vertical directions) integrated with self-attention mechanisms, enhancing perception capability for asymmetric continuous-scale variations. Secondly, an Offset-Frequency Cooperative Module (OFCM) is developed wherein a learnable offset generator dynamically adjusts sampling point distributions to enhance intra-class consistency, while a dual-channel frequency domain filter performs adaptive high-pass filtering to sharpen target boundaries. These components synergistically solve feature consistency degradation and boundary ambiguity under asymmetric changes. 
Experiments show that this framework significantly improves the segmentation accuracy and boundary clarity of multi-scale targets in off-road scene segmentation tasks: it achieves 71.22% MIoU on the standard RUGD dataset (0.84% higher than the existing optimal method) and 83.05% MIoU on the Freiburg_Forest dataset. Among them, the segmentation accuracy of key obstacle categories is significantly improved to 52.04% (2.7% higher than the sub-optimal model). This framework effectively compensates for the impact of asymmetric deformation through a symmetric computing mechanism. Full article
(This article belongs to the Section Computer)

23 pages, 6490 KB  
Article
LISA-YOLO: A Symmetry-Guided Lightweight Small Object Detection Framework for Thyroid Ultrasound Images
by Guoqing Fu, Guanghua Gu, Wen Liu and Hao Fu
Symmetry 2025, 17(8), 1249; https://doi.org/10.3390/sym17081249 - 6 Aug 2025
Cited by 1 | Viewed by 1133
Abstract
Non-invasive ultrasound diagnosis, combined with deep learning, is frequently used for detecting thyroid diseases. However, real-time detection on portable devices faces limitations due to constrained computational resources, and existing models often lack sufficient capability for small object detection of thyroid nodules. To address this, this paper proposes an improved lightweight small object detection network framework called LISA-YOLO, which enhances the lightweight multi-scale collaborative fusion algorithm. The proposed framework exploits the inherent symmetrical characteristics of ultrasound images and the symmetrical architecture of the detection network to better capture and represent features of thyroid nodules. Specifically, an improved depthwise separable convolution algorithm replaces traditional convolution to construct a lightweight network (DG-FNet). Through symmetrical cross-scale fusion operations via FPN, detection accuracy is maintained while reducing computational overhead. Additionally, an improved bidirectional feature network (IMS F-NET) fully integrates the semantic and detailed information of high- and low-level features symmetrically, enhancing the representation capability for multi-scale features and improving the accuracy of small object detection. Finally, a collaborative attention mechanism (SAF-NET) uses a dual-channel and spatial attention mechanism to adaptively calibrate channel and spatial weights in a symmetric manner, effectively suppressing background noise and enabling the model to focus on small target areas in thyroid ultrasound images. Extensive experiments on two image datasets demonstrate that the proposed method achieves improvements of 2.3% in F1 score, 4.5% in mAP, and 9.0% in FPS, while maintaining only 2.6 M parameters and reducing GFLOPs from 6.1 to 5.8. 
The proposed framework provides significant advancements in lightweight real-time detection and demonstrates the important role of symmetry in enhancing the performance of ultrasound-based thyroid diagnosis. Full article
(This article belongs to the Section Computer)

16 pages, 4587 KB  
Article
FAMNet: A Lightweight Stereo Matching Network for Real-Time Depth Estimation in Autonomous Driving
by Jingyuan Zhang, Qiang Tong, Na Yan and Xiulei Liu
Symmetry 2025, 17(8), 1214; https://doi.org/10.3390/sym17081214 - 1 Aug 2025
Viewed by 2482
Abstract
Accurate and efficient stereo matching is fundamental to real-time depth estimation from symmetric stereo cameras in autonomous driving systems. However, existing high-accuracy stereo matching networks typically rely on computationally expensive 3D convolutions, which limit their practicality in real-world environments. In contrast, real-time methods often sacrifice accuracy or generalization capability. To address these challenges, we propose FAMNet (Fusion Attention Multi-Scale Network), a lightweight and generalizable stereo matching framework tailored for real-time depth estimation in autonomous driving applications. FAMNet consists of two novel modules: Fusion Attention-based Cost Volume (FACV) and Multi-scale Attention Aggregation (MAA). FACV constructs a compact yet expressive cost volume by integrating multi-scale correlation, attention-guided feature fusion, and channel reweighting, thereby reducing reliance on heavy 3D convolutions. MAA further enhances disparity estimation by fusing multi-scale contextual cues through pyramid-based aggregation and dual-path attention mechanisms. Extensive experiments on the KITTI 2012 and KITTI 2015 benchmarks demonstrate that FAMNet achieves a favorable trade-off between accuracy, efficiency, and generalization. On KITTI 2015, with the incorporation of FACV and MAA, the prediction accuracy of the baseline model is improved by 37% and 38%, respectively, and a total improvement of 42% is achieved by our final model. These results highlight FAMNet’s potential for practical deployment in resource-constrained autonomous driving systems requiring real-time and reliable depth perception. Full article
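
As background for the cost-volume discussion above, the sketch below builds a plain single-scale correlation cost volume along one image row, which is the kind of 2D-friendly alternative to concatenation volumes processed by 3D convolutions. It is not the FACV module: the multi-scale correlation, attention-guided fusion, and channel reweighting mentioned in the abstract are omitted, and the name `correlation_cost_volume` is hypothetical.

```python
def correlation_cost_volume(left, right, max_disp):
    """Correlate left/right features of shape [channels][width] over disparities.

    cost[d][x] is the channel-averaged product of left[:, x] and right[:, x - d];
    positions with no valid match (x < d) are left at 0.0.
    """
    n_ch, width = len(left), len(left[0])
    cost = [[0.0] * width for _ in range(max_disp)]
    for d in range(max_disp):
        for x in range(d, width):
            # In a rectified pair, left pixel x corresponds to right pixel x - d.
            cost[d][x] = sum(left[c][x] * right[c][x - d] for c in range(n_ch)) / n_ch
    return cost
```

The result has one score per candidate disparity and pixel, so later layers can operate on it with ordinary 2D operations.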
21 pages, 5616 KB  
Article
Symmetry-Guided Dual-Branch Network with Adaptive Feature Fusion and Edge-Aware Attention for Image Tampering Localization
by Zhenxiang He, Le Li and Hanbin Wang
Symmetry 2025, 17(7), 1150; https://doi.org/10.3390/sym17071150 - 18 Jul 2025
Cited by 1 | Viewed by 973
Abstract
When faced with diverse types of image tampering and image quality degradation in real-world scenarios, traditional image tampering localization methods often struggle to balance boundary accuracy and robustness. To address these issues, this paper proposes FENet (Fusion-Enhanced Network), a symmetry-guided dual-branch image tampering localization network that integrates adaptive feature fusion and edge attention mechanisms. The method is based on a structurally symmetric dual-branch architecture, which extracts RGB semantic features and SRM noise residual information to comprehensively capture the fine-grained differences in tampered regions at the visual and statistical levels. To fuse these features effectively, this paper designs a self-calibrating fusion module (SCF), which introduces a content-aware dynamic weighting mechanism to adaptively adjust the importance of different feature branches, thereby enhancing the discriminative power and expressiveness of the fused features. Furthermore, considering that image tampering often involves abnormal changes in edge structures, we propose an edge-aware coordinate attention mechanism (ECAM). By jointly modeling spatial position information and edge-guided information, ECAM guides the model to focus more precisely on potential tampering boundaries, thereby enhancing its boundary detection and localization capabilities. Experiments on public datasets such as Columbia, CASIA, and NIST16 demonstrate that FENet achieves significantly better results than existing methods. We also analyze the model’s performance under various image quality conditions, such as JPEG compression and Gaussian blur, demonstrating its robustness in real-world scenarios. Experiments in Facebook, Weibo, and WeChat scenarios show that our method achieves average F1 scores that are 2.8%, 3%, and 5.6% higher than those of existing state-of-the-art methods, respectively. Full article
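The idea of content-aware dynamic weighting between two feature branches can be illustrated with a minimal sketch. This is not the SCF module from the paper: the function names `softmax` and `adaptive_fuse`, the use of plain Python lists as features, and the choice of mean activation as the per-branch score are all assumptions made only for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def adaptive_fuse(rgb_feat, noise_feat):
    """Fuse two equal-length feature vectors with input-dependent weights.

    Assumption: each branch is scored by its mean activation. A real module
    would learn this scoring function instead of hard-coding it.
    """
    if len(rgb_feat) != len(noise_feat):
        raise ValueError("branches must have the same length")
    scores = [sum(rgb_feat) / len(rgb_feat), sum(noise_feat) / len(noise_feat)]
    w_rgb, w_noise = softmax(scores)  # the two weights sum to 1
    fused = [w_rgb * r + w_noise * n for r, n in zip(rgb_feat, noise_feat)]
    return fused, (w_rgb, w_noise)

fused, weights = adaptive_fuse([1.0, 3.0], [0.0, 0.0])
print(weights, fused)
```

Because the weights are computed from the inputs themselves, the branch with the stronger response contributes more to the fused feature for that particular input, which is the general behavior a dynamic weighting scheme aims for.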
26 pages, 5237 KB  
Article
A Bridge Defect Detection Algorithm Based on UGMB Multi-Scale Feature Extraction and Fusion
by Haiyan Zhang, Chao Tian, Ao Zhang, Yilin Liu, Guxue Gao, Zhiwen Zhuang, Tongtong Yin and Nuo Zhang
Symmetry 2025, 17(7), 1025; https://doi.org/10.3390/sym17071025 - 30 Jun 2025
Cited by 2 | Viewed by 1304
Abstract
To address the missed detections and false detections caused by insufficient multi-scale feature extraction and an excessive number of model parameters in bridge defect detection, this paper proposes the AMSF-Pyramid-YOLOv11n model. First, a Cooperative Optimization Module (COPO) is introduced, which consists of the designed multi-level dilated shared convolution (FPSharedConv) and a dual-domain attention block. Through the joint optimization of FPSharedConv and a CGLU gating mechanism, the module significantly improves feature extraction efficiency and learning capability. Second, the Unified Global-Multiscale Bottleneck (UGMB) multi-scale feature pyramid designed in this study efficiently integrates the FCGL_MANet, WFU, and HAFB modules. By leveraging the symmetry of Haar wavelet decomposition combined with local-global attention, this module effectively addresses the challenge of multi-scale feature fusion, enhancing the model’s ability to capture both symmetrical and asymmetrical bridge defect patterns. Finally, an optimized lightweight detection head (LCB_Detect) is employed, which reduces the parameter count by 6.35% through shared convolution layers and separate batch normalization. Experimental results show that the proposed model achieves a mean average precision (mAP@0.5) of 60.3% on a self-constructed bridge defect dataset, representing an improvement of 11.3% over the baseline YOLOv11n. The model effectively reduces the false positive rate while improving the detection accuracy of bridge defects. Full article
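The Haar wavelet decomposition mentioned above splits a signal into a coarse approximation and a detail component, and the same averaging and differencing step inverts it exactly. The following is a minimal one-dimensional, single-level sketch under that textbook definition. It is not the paper's module: the function names `haar_forward` and `haar_inverse` are hypothetical, and a detection network would apply the transform to 2D feature maps.

```python
import math

_S = 1.0 / math.sqrt(2.0)  # orthonormal Haar scaling factor

def haar_forward(signal):
    """Single-level 1D Haar transform of an even-length list."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    approx = [(x0 + x1) * _S for x0, x1 in pairs]  # low-pass: scaled sums
    detail = [(x0 - x1) * _S for x0, x1 in pairs]  # high-pass: scaled differences
    return approx, detail

def haar_inverse(approx, detail):
    """Reconstruct the signal from its approximation and detail parts."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) * _S)
        out.append((a - d) * _S)
    return out

x = [4.0, 2.0, 5.0, 5.0]
approx, detail = haar_forward(x)
print(approx, detail, haar_inverse(approx, detail))
```

The detail coefficient is zero wherever neighboring samples are equal, so locally uniform regions are carried entirely by the approximation band, while edges and abrupt changes show up in the detail band.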
(This article belongs to the Section Computer)