Deep Learning and Deep Learning Synergy of Transformers and Symmetry in Small Object Detection and Tracking

A special issue of Symmetry (ISSN 2073-8994).

Deadline for manuscript submissions: 31 May 2026 | Viewed by 9905

Special Issue Editors


Prof. Dr. Fengping An
Guest Editor
School of Automation and Software Engineering, Shanxi University, Taiyuan 030006, China
Interests: image processing; artificial intelligence; deep learning; target detection; pattern recognition; target recognition

Prof. Dr. Haitao Xu
Guest Editor
Department of Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China
Interests: wireless resource allocation and management; wireless communications and networking; dynamic game and mean field game theory; big data analysis; security

Dr. Chuyang Ye
Guest Editor Assistant
School of Integrated Circuits and Electronics, Beijing Institute of Technology, Beijing 100811, China
Interests: medical image processing; deep learning

Special Issue Information

Dear Colleagues,

Small-object detection and tracking is one of the most challenging and fundamental topics in computer vision. These objects are typically characterized by their small size, low contrast, and blurred edges, which together complicate detection and tracking. In recent years, advances in this field have enabled the application of small-object detection and tracking to remote sensing imagery across diverse domains, including mineral exploration, precision agriculture, urban planning, forestry management, military target identification, and disaster assessment. Despite this progress, several critical challenges persist in real-world applications: the difficulty of extracting detailed information from small objects, the trade-off between detection accuracy and computational efficiency, the identification of unknown or untrained categories within image data, and the effective tracking of small objects over time. Addressing these challenges is essential for advancing the practical utility of small-object detection and tracking systems.

Therefore, we invite submissions on theoretical research and practical applications concerning deep learning and the synergy of transformers and symmetry-based architectures for small-object detection and tracking in image processing.

Prof. Dr. Fengping An
Prof. Dr. Haitao Xu
Guest Editors

Dr. Chuyang Ye
Guest Editor Assistant

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 250 words) can be sent to the Editorial Office for assessment.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Symmetry is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • deep learning
  • image processing
  • object recognition
  • artificial intelligence

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (4 papers)


Research

23 pages, 9399 KB  
Article
Restoring Geometric and Probabilistic Symmetry for Tiny Football Localization in Dynamic Environments
by Hongyang Liu, Longying Wang, Qiang Zheng, Gang Zhao and Huiteng Xu
Symmetry 2026, 18(4), 587; https://doi.org/10.3390/sym18040587 - 30 Mar 2026
Viewed by 431
Abstract
The precise identification of minute, high-velocity entities within unconstrained visual fields represents a significant hurdle in computational perception. This difficulty primarily arises from the geometric degradation stemming from scale volatility, motion-induced asymmetry, and heterogeneous background clutter. To mitigate the critical deficit of high-fidelity benchmarks for dynamic micro-targets, we present Soccer-Wild. This comprehensive dataset is characterized by the extreme visual complexity of microscopic objects in diverse ecological settings. Built upon this empirical foundation, we introduce GOAL (Global Object Alignment for Localization). This novel computational paradigm is designed to enhance the weak features of tiny targets by integrating frequency-domain filtering, dynamic feature routing, and entropy-guided probabilistic modeling. The GOAL framework rigorously preserves spatial-structural equilibrium and information fidelity through three synergetic mechanisms: (1) Spectral Purification: We implement a Frequency-aware Spectral Gating approach that operates in the Fourier manifold, suppressing stochastic noise to accentuate the spectral signatures of the targets; (2) Geometric Adaptation: A Multi-Granularity Mixture of Experts (MG-MoE) is formulated with heterogeneous receptive fields to dynamically rectify anisotropic distortions caused by kinetic blurring. This adaptive routing ensures cross-state representation consistency; (3) Information Recovery: We propose Information-Guided Gaussian Distribution Estimation (IGDE), which utilizes information entropy to conceptualize target coordinates as radially symmetric probability densities. This facilitates the implicit recovery of latent signals typically discarded by rigid deterministic regression. Empirical validations on the Soccer-Wild and VisDrone2019 benchmarks reveal that the proposed methodology yields substantial gains in precision. 
Specifically, our model achieves 40.0% and 40.4% AP (Average Precision), respectively, establishing a new state-of-the-art for localizing highly dynamic, micro-scale objects.
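
The abstract above describes representing target coordinates as radially symmetric probability densities. As a rough illustration of that general idea only, the sketch below builds a normalized isotropic 2D Gaussian around a target location. It is not the authors' IGDE method: the function name `gaussian_density`, the grid size, and the fixed `sigma` are assumptions made for this example, and the entropy-guided part of the paper is not modeled.

```python
import math


def gaussian_density(height, width, cx, cy, sigma):
    """Return a height x width grid holding an isotropic Gaussian centred at (cx, cy).

    Hypothetical helper for illustration; the grid is normalised to sum to 1
    so it can be read as a discrete probability density over pixel positions.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    raw = [
        [
            math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))
            for x in range(width)
        ]
        for y in range(height)
    ]
    total = sum(sum(row) for row in raw)
    return [[value / total for value in row] for row in raw]


# A 5 x 5 grid with the target at the centre pixel (column 2, row 2).
density = gaussian_density(5, 5, cx=2, cy=2, sigma=1.0)
peak = max(max(row) for row in density)
```

Because the density depends only on the squared distance to the centre, pixels at equal distance receive equal mass, which is the radial symmetry the abstract refers to.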

28 pages, 5526 KB  
Article
Symmetry-Aware SwinUNet with Integrated Attention for Transformer-Based Segmentation of Thyroid Ultrasound Images
by Ammar Oad, Imtiaz Hussain Koondhar, Feng Dong, Weibing Liu, Beiji Zou, Weichun Liu, Yun Chen and Yaoqun Wu
Symmetry 2026, 18(1), 141; https://doi.org/10.3390/sym18010141 - 10 Jan 2026
Viewed by 752
Abstract
Accurate segmentation of thyroid nodules in ultrasound images remains challenging due to low contrast, speckle noise, and inter-patient variability that disrupt the inherent spatial symmetry of thyroid anatomy. This study proposes a symmetry-aware SwinUNet framework with integrated spatial attention for thyroid nodule segmentation. The hierarchical window-based Swin Transformer encoder preserves spatial symmetry and scale consistency while capturing both global contextual information and fine-grained local features. Attention modules in the decoder emphasize symmetry-consistent anatomical regions and asymmetric nodule boundaries, effectively suppressing irrelevant background responses. The proposed method was evaluated on the publicly available TN3K thyroid ultrasound dataset. Experimental results demonstrate strong performance, achieving a Dice Similarity Coefficient of 85.51%, precision of 87.05%, recall of 89.13%, an IoU of 78.00%, accuracy of 97.02%, and an AUC of 99.02%. Compared with the baseline model, the proposed approach improves the IoU and Dice score by 15.38% and 12.05%, respectively, confirming its ability to capture symmetry-preserving nodule morphology and boundary asymmetry. These findings indicate that the proposed symmetry-aware SwinUNet provides a robust and clinically promising solution for thyroid ultrasound image analysis and computer-aided diagnosis.
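
For readers unfamiliar with the two overlap metrics reported above, the following minimal sketch computes the Dice Similarity Coefficient and IoU for a single pair of binary masks. The function name `dice_and_iou`, the nested-list mask format, and the convention of returning 1.0 when both masks are empty are assumptions for illustration; the paper's own evaluation code is not shown in the abstract.

```python
def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two equally sized binary masks (nested lists of 0/1)."""
    pred_flat = [p for row in pred for p in row]
    target_flat = [t for row in target for t in row]
    if len(pred_flat) != len(target_flat):
        raise ValueError("masks must contain the same number of pixels")

    intersection = sum(1 for p, t in zip(pred_flat, target_flat) if p and t)
    pred_area = sum(1 for p in pred_flat if p)
    target_area = sum(1 for t in target_flat if t)
    union = pred_area + target_area - intersection

    if union == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0, 1.0

    dice = 2 * intersection / (pred_area + target_area)
    iou = intersection / union
    return dice, iou


# Toy 3 x 3 masks: 3 predicted pixels, 3 ground-truth pixels, 2 shared.
pred = [[0, 1, 1], [0, 1, 0], [0, 0, 0]]
target = [[0, 1, 0], [0, 1, 1], [0, 0, 0]]
dice, iou = dice_and_iou(pred, target)
```

For one mask pair the two metrics are tied by Dice = 2·IoU / (1 + IoU); that identity need not hold for dataset-level figures when each metric is averaged over images separately.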

27 pages, 1255 KB  
Article
CMTA: Infrared Detection Model for Power Facility Components via Multi-Angle Perception and Transattn Fusion
by Zhongyuan Fan, Lufeng Yuan, Biyao Wen, Qiang Liu and Gengkun Wu
Symmetry 2025, 17(11), 1909; https://doi.org/10.3390/sym17111909 - 7 Nov 2025
Cited by 1 | Viewed by 635
Abstract
Infrared detection of defects in power facilities is critical to the safe operation and fault early warning of power grids. However, conventional inspection methods have distinct limitations, such as delayed response and insufficient condition visualization. To address the pain points and technical challenges of the aforementioned inspection modes, this study proposes a deep learning network model based on multi-angle perception and Transattn feature fusion. This model can effectively improve the defect recognition ability of power facility components in complex scenarios. Firstly, a modified MAPC module is introduced, which enhances the extraction of edge contours of power facility components and detailed infrared thermal textures. Secondly, an innovative Transattn module is proposed to dynamically focus on the core component regions of power facilities. Finally, a feature fusion strategy is used to efficiently integrate the feature maps from each module, outputting component localization results and defect category information. Experimental results based on the infrared detection dataset of power facility components show that compared with classical detection models such as YOLOv10 and DDN, the proposed CMTA model achieves the best performance in all indicators: the highest mAP50 reaches 85.01%, the frame rate (FPS) is 252 frames per second, the parameter count is only 2.8 M, and it significantly shortens the fault response time of operation and maintenance personnel.

18 pages, 5013 KB  
Article
Enhancing Document Forgery Detection with Edge-Focused Deep Learning
by Yong-Yeol Bae, Dae-Jea Cho and Ki-Hyun Jung
Symmetry 2025, 17(8), 1208; https://doi.org/10.3390/sym17081208 - 30 Jul 2025
Cited by 2 | Viewed by 7403
Abstract
Detecting manipulated document images is essential for verifying the authenticity of official records and preventing document forgery. However, forgery artifacts are often subtle and localized in fine-grained regions, such as text boundaries or character outlines, where visual symmetry and structural regularity are typically expected. These manipulations can disrupt the inherent symmetry of document layouts, making the detection of such inconsistencies crucial for forgery identification. Conventional CNN-based models face limitations in capturing such edge-level asymmetric features, as edge-related information tends to weaken through repeated convolution and pooling operations. To address this issue, this study proposes an edge-focused method composed of two components: the Edge Attention (EA) layer and the Edge Concatenation (EC) layer. The EA layer dynamically identifies channels that are highly responsive to edge features in the input feature map and applies learnable weights to emphasize them, enhancing the representation of boundary-related information, thereby emphasizing structurally significant boundaries. Subsequently, the EC layer extracts edge maps from the input image using the Sobel filter and concatenates them with the original feature maps along the channel dimension, allowing the model to explicitly incorporate edge information. To evaluate the effectiveness and compatibility of the proposed method, it was initially applied to a simple CNN architecture to isolate its impact. Subsequently, it was integrated into various widely used models, including DenseNet121, ResNet50, Vision Transformer (ViT), and a CAE-SVM-based document forgery detection model. Experiments were conducted on the DocTamper, Receipt, and MIDV-2020 datasets to assess classification accuracy and F1-score using both original and forged text images. 
Across all model architectures and datasets, the proposed EA–EC method consistently improved model performance, particularly by increasing sensitivity to asymmetric manipulations around text boundaries. These results demonstrate that the proposed edge-focused approach is not only effective but also highly adaptable, serving as a lightweight and modular extension that can be easily incorporated into existing deep learning-based document forgery detection frameworks. By reinforcing attention to structural inconsistencies often missed by standard convolutional networks, the proposed method provides a practical solution for enhancing the robustness and generalizability of forgery detection systems.
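
The Edge Concatenation step described in the abstract above (a Sobel edge map appended to the feature maps along the channel dimension) can be sketched in plain Python as follows. This is a simplified illustration, not the authors' EC layer: the names `sobel_magnitude` and `concat_edge_channel`, the replicated-border handling, the use of gradient magnitude, and the assumption that the feature maps share the image's resolution are all choices made for this example.

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(image):
    """Return the Sobel gradient magnitude of a 2D grayscale image (nested lists)."""
    height, width = len(image), len(image[0])

    def pixel(r, c):
        # Assumed border handling: replicate the nearest edge pixel.
        r = min(max(r, 0), height - 1)
        c = min(max(c, 0), width - 1)
        return image[r][c]

    edges = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            gx = gy = 0.0
            for dr in range(3):
                for dc in range(3):
                    value = pixel(r + dr - 1, c + dc - 1)
                    gx += SOBEL_X[dr][dc] * value
                    gy += SOBEL_Y[dr][dc] * value
            edges[r][c] = math.hypot(gx, gy)
    return edges


def concat_edge_channel(feature_maps, image):
    """Append the edge map of `image` as one extra channel to `feature_maps`.

    `feature_maps` is a list of channels, each with the same height and width
    as `image` (an assumption of this sketch).
    """
    edge_map = sobel_magnitude(image)
    for channel in feature_maps:
        if len(channel) != len(edge_map) or len(channel[0]) != len(edge_map[0]):
            raise ValueError("feature maps and image must share spatial size")
    return feature_maps + [edge_map]


# Toy 4 x 4 image with a vertical step edge between columns 1 and 2.
image = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
edges = sobel_magnitude(image)
combined = concat_edge_channel([image], image)
```

In this toy case the edge channel responds only in the two columns adjacent to the step, so the concatenated tensor carries explicit boundary information alongside the original channel.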