Addressing Real-World Challenges in Recognition and Classification with Cutting-Edge AI Models and Methods

A special issue of Information (ISSN 2078-2489). This special issue belongs to the section "Artificial Intelligence".

Deadline for manuscript submissions: 31 December 2025 | Viewed by 8183

Special Issue Editors

Guest Editor
Department of Computer Science, Tulane University, New Orleans, LA 70118, USA
Interests: self-supervised learning; hybrid quantum–classical models; nature-inspired feature selection; positional encoding methods

Guest Editor
Department of Computer Science, Tulane University, New Orleans, LA 70118, USA
Interests: large language models; applications of LLMs; NLP; model optimizations; AI for code; big data analytics

Special Issue Information

Dear Colleagues,

Deep learning has emerged as a transformative technology, significantly enhancing the performance of classification and recognition tasks across various domains. The ability of deep neural networks to extract complex patterns from large datasets has led to unprecedented breakthroughs in fields such as computer vision, natural language processing, machine translation, speech recognition, medical diagnostics, drug discovery, and intelligent automation.

Despite these advancements, challenges such as model interpretability, computational efficiency, data scarcity, and real-time deployment on edge devices continue to constrain progress. Additionally, as classification problems grow more complex, especially in biomedical AI, drug discovery, and smart agriculture, there is a need to look beyond classical deep learning approaches, driven primarily by challenges such as multilingual and unstructured data and high-dimensional data representations.

Emerging paradigms such as quantum machine learning (QML) and large language models (LLMs) offer new opportunities to integrate adaptive knowledge, optimize feature representations, accelerate training, improve inference efficiency in data-scarce environments, and generalize across healthcare, agriculture, commonsense reasoning, biomedical applications, and programming.

This Special Issue aims to compile innovative research that presents the practical application of deep learning models in classification and recognition, addressing both fundamental challenges and novel approaches that enhance accuracy, efficiency, and real-world usability. More specifically, we welcome contributions whose scope includes, but is not limited to, the following challenges facing AI models and methods:

  • Data Imbalance and Limited Annotations—Investigating novel techniques such as data augmentation, self-supervised learning, and active learning to address imbalanced and scarce datasets.
  • Domain Adaptation and Transfer Learning—Exploring methods to improve model generalization across different datasets, domains, and real-world environments.
  • Explainability and Interpretability in AI Models—Enhancing transparency in recognition and classification models for improved trust, fairness, and accountability.
  • Few-Shot, Zero-Shot, and Self-Supervised Learning—Advancing recognition and classification models that require minimal labeled data for improved adaptability.
  • Multimodal and Cross-Modal Learning—Integrating multiple data sources such as text, images, video, and sensor data to improve classification accuracy and robustness.
  • Real-Time and Scalable AI Solutions—Optimizing recognition and classification models for high-speed, large-scale processing in resource-constrained environments.
  • Bias Mitigation and Ethical AI—Addressing fairness, bias, and ethical concerns in AI-driven classification to ensure responsible and unbiased decision-making.
  • Application-Specific Challenges and Innovations—Addressing domain-specific recognition problems in healthcare, agriculture, business, social media, and education using cutting-edge AI techniques.

Dr. Syed Naqvi
Dr. Md Mostafizer Rahman
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 250 words) can be sent to the Editorial Office for assessment.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Information is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1800 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • classification and recognition
  • large language models
  • hybrid quantum–classical approaches
  • multimodal and cross-modal learning
  • class imbalance
  • feature extraction and selection

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found on the MDPI website.

Published Papers (8 papers)


Research

28 pages, 3335 KB  
Article
MDFA-AconvNet: A Novel Multiscale Dilated Fusion Attention All-Convolution Network for SAR Target Classification
by Jiajia Wang, Jun Liu, Pin Zhang, Qi Jia, Xin Yang, Shenyu Du and Xueyu Bai
Information 2025, 16(11), 1007; https://doi.org/10.3390/info16111007 - 19 Nov 2025
Viewed by 348
Abstract
Synthetic aperture radar (SAR) features all-weather and all-day imaging capabilities, long-range detection, and high resolution, making it indispensable for battlefield reconnaissance, target detection, and guidance. In recent years, deep learning has emerged as a prominent approach for the classification of SAR image targets, owing to its hierarchical feature extraction, progressive refinement, and end-to-end learning capabilities. However, challenges such as the high cost of SAR data acquisition and the limited number of labeled samples often result in overfitting and poor model generalization. In addition, conventional convolutional layers typically operate with fixed receptive fields, making it difficult to simultaneously capture multiscale contextual information and dynamically focus on salient target features. To address these limitations, this paper proposes a novel architecture: the Multiscale Dilated Fusion Attention All-Convolution Network (MDFA-AconvNet). The model incorporates a multiscale dilated attention mechanism that significantly broadens the receptive field across varying target scales in SAR images without compromising spatial resolution, thereby enhancing multiscale feature extraction. Furthermore, by introducing both channel attention and spatial attention mechanisms, the model is able to selectively emphasize informative feature channels and spatial regions relevant to target recognition. These attention modules are seamlessly integrated into the All-Convolution Network (A-convNet) backbone, resulting in comprehensive performance improvements. Extensive experiments on the MSTAR dataset demonstrate that the proposed MDFA-AconvNet achieves a high classification accuracy of 99.38% across ten target classes, markedly outperforming the original A-convNet algorithm. These compelling results highlight the model’s robustness against target variations and its significant potential for practical deployment, paving the way for more efficient SAR image classification and recognition systems.
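
As a rough illustration of the multiscale dilated attention idea, the following PyTorch sketch combines parallel dilated 3×3 convolutions, which widen the receptive field without reducing spatial resolution, with a squeeze-and-excitation-style channel gate. The module name and dimensions are our own illustrative choices, not the published MDFA-AconvNet code.

import torch
import torch.nn as nn

class MultiscaleDilatedAttention(nn.Module):
    """Illustrative sketch: parallel dilated convs plus a channel gate.
    Hypothetical module, not the authors' implementation."""
    def __init__(self, channels, dilations=(1, 2, 4)):
        super().__init__()
        # A 3x3 conv with dilation d and padding d preserves spatial size.
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=d, dilation=d)
            for d in dilations
        )
        self.fuse = nn.Conv2d(channels * len(dilations), channels, 1)
        # Squeeze-and-excitation-style gate reweights feature channels.
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // 4, 1), nn.ReLU(),
            nn.Conv2d(channels // 4, channels, 1), nn.Sigmoid(),
        )

    def forward(self, x):
        multi = torch.cat([b(x) for b in self.branches], dim=1)
        fused = self.fuse(multi)
        return fused * self.gate(fused)

x = torch.randn(2, 32, 64, 64)
print(MultiscaleDilatedAttention(32)(x).shape)  # torch.Size([2, 32, 64, 64])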

17 pages, 8015 KB  
Article
DFA-YOLO: A Novel YOLO Model for Electric Power Operation Violation Recognition
by Xiaoliang Qian, Xinyu Ding, Pengfei Wang, Jungang Guo, Hu Chen, Wei Wang and Peixu Xing
Information 2025, 16(11), 974; https://doi.org/10.3390/info16110974 - 11 Nov 2025
Viewed by 325
Abstract
The You Only Look Once (YOLO) series of models, particularly the recently introduced YOLOv12 model, have demonstrated significant potential in achieving accurate and rapid recognition of electric power operation violations, due to their comprehensive advantages in detection accuracy and real-time inference. However, current YOLO models still have three limitations: (1) the absence of dedicated feature extraction for multi-scale objects, resulting in suboptimal detection of objects with varying sizes; (2) naive integration of spatial and channel attention, which restricts feature discriminability and consequently impairs detection performance for challenging objects in complex backgrounds; and (3) weak representation capability in low-level features, leading to insufficient accuracy for small objects. To address these limitations, a novel YOLO model, DFA-YOLO, is proposed: a real-time object detector with YOLOv12n as its baseline that makes three key contributions. Firstly, a dynamic weighted multi-scale convolution (DWMConv) module is proposed to address the first limitation; it employs lightweight multi-scale convolution followed by learnable weighted fusion to enhance feature representation for multi-scale objects. Secondly, a full-dimensional attention (FDA) module is proposed to address the second limitation; it provides a unified attention computation scheme that effectively integrates attention across the height, width, and channel dimensions, thereby improving feature discriminability. Thirdly, a set of auxiliary detection heads (Aux-Heads) is introduced to address the third limitation; inserted into the backbone network, they strengthen the training signal that labels provide to the low-level feature extraction modules. Ablation studies on the EPOVR-v1.0 dataset demonstrate the validity of the proposed DWMConv module, FDA module, Aux-Heads, and their synergistic integration. Relative to the baseline model, DFA-YOLO improves mAP@0.5 and mAP@0.5–0.95 by 3.15% and 4.13%, respectively, while reducing parameters and GFLOPS by 0.06M and 0.06, respectively, and increasing FPS by 3.52. Comprehensive quantitative comparisons with nine official YOLO models, including YOLOv13n, confirm that DFA-YOLO achieves superior performance in both detection precision and real-time inference.
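
One plausible reading of the DWMConv description above is a set of lightweight depthwise convolutions at several kernel sizes whose outputs are fused with softmax-normalized learnable weights. The PyTorch sketch below is a hypothetical reconstruction, not the authors' module.

import torch
import torch.nn as nn

class WeightedMultiScaleConv(nn.Module):
    """Hypothetical DWMConv-style block: multi-scale depthwise convs
    fused by learnable, softmax-normalized scale weights."""
    def __init__(self, channels, kernel_sizes=(1, 3, 5)):
        super().__init__()
        # Depthwise convs (groups=channels) keep each branch lightweight.
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, k, padding=k // 2, groups=channels)
            for k in kernel_sizes
        )
        # One learnable scalar per scale; softmax keeps weights normalized.
        self.scale_logits = nn.Parameter(torch.zeros(len(kernel_sizes)))

    def forward(self, x):
        w = torch.softmax(self.scale_logits, dim=0)
        return sum(w[i] * branch(x) for i, branch in enumerate(self.branches))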

20 pages, 7048 KB  
Article
Enhanced Lightweight Object Detection Model in Complex Scenes: An Improved YOLOv8n Approach
by Sohaya El Hamdouni, Boutaina Hdioud and Sanaa El Fkihi
Information 2025, 16(10), 871; https://doi.org/10.3390/info16100871 - 8 Oct 2025
Viewed by 1024
Abstract
Object detection has a vital impact on the analysis and interpretation of visual scenes. It is widely utilized in various fields, including healthcare, autonomous driving, and vehicle surveillance. However, complex scenes containing small, occluded, and multiscale objects present significant difficulties for object detection. This paper introduces a lightweight object detection algorithm, utilizing YOLOv8n as the baseline model, to address these problems. Our method comprises four steps. Firstly, we add a layer for small object detection to enhance the feature expression capability of small objects. Secondly, to handle complex forms and appearances, we employ the C2f-DCNv2 module, which integrates advanced DCNv2 (Deformable Convolutional Networks v2) by substituting the final C2f module in the backbone. Thirdly, we integrate the CBAM, a lightweight attention module, into the neck section to address missed detections. Finally, we use Ghost Convolution (GhostConv) as a light convolutional layer, alternating it with ordinary convolution in the neck; this maintains good detection performance while decreasing the number of parameters. Experimental performance on the PASCAL VOC dataset demonstrates that our approach lowers the number of model parameters by approximately 9.37%. The mAP@0.5:0.95 increased by 0.9%, recall (R) increased by 0.8%, mAP@0.5 increased by 0.3%, and precision (P) increased by 0.1% compared to the baseline model. To better evaluate the model’s generalization performance in real-world driving scenarios, we conducted additional experiments using the KITTI dataset. Compared to the baseline model, our approach yielded a 0.8% improvement in mAP@0.5 and 1.3% in mAP@0.5:0.95. This result indicates strong performance in more dynamic and challenging conditions.
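
Of the components above, Ghost Convolution is the most self-contained to sketch: a primary convolution produces half the output channels, and a cheap depthwise convolution derives the remaining "ghost" channels, roughly halving the parameter cost of a full convolution. A minimal PyTorch version, assuming an even output channel count (the module name and kernel choices are ours, not the paper's exact layer):

import torch
import torch.nn as nn

class GhostConv(nn.Module):
    """Minimal Ghost convolution sketch (after GhostNet): half the output
    channels come from a primary conv, half from a cheap depthwise conv."""
    def __init__(self, c_in, c_out, k=1):
        super().__init__()
        c_half = c_out // 2  # assumes c_out is even
        self.primary = nn.Conv2d(c_in, c_half, k, padding=k // 2)
        self.cheap = nn.Conv2d(c_half, c_half, 5, padding=2, groups=c_half)

    def forward(self, x):
        y = self.primary(x)
        return torch.cat([y, self.cheap(y)], dim=1)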

26 pages, 2107 KB  
Article
TSRACE-AI: Traffic Sign Recognition Accelerated with Co-Designed Edge AI Based on Hybrid FPGA Architecture for ADAS
by Abderrahmane Smaali, Said Ben Alla and Abdellah Touhafi
Information 2025, 16(8), 703; https://doi.org/10.3390/info16080703 - 18 Aug 2025
Viewed by 890
Abstract
The need for efficient and real-time traffic sign recognition has become increasingly important as autonomous vehicles and Advanced Driver Assistance Systems (ADASs) continue to evolve. This study introduces TSRACE-AI, a system that accelerates traffic sign recognition by combining hardware and software in a hybrid architecture deployed on the PYNQ-Z2 FPGA platform. The design employs the Deep Learning Processing Unit (DPU) for hardware acceleration and incorporates 8-bit fixed-point quantization to enhance the performance of the CNN model. The proposed system achieves a 98.85% reduction in latency and a 200.28% increase in throughput compared to similar works, with a trade-off of a 90.35% decrease in power efficiency. Despite this trade-off, the system excels in latency-sensitive applications, demonstrating its suitability for real-time decision-making. By balancing speed and power efficiency, TSRACE-AI offers a compelling solution for integrating traffic sign recognition into ADAS, paving the way for enhanced autonomous driving capabilities.
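
The 8-bit fixed-point quantization step can be illustrated with a short NumPy sketch that maps float weights to signed 8-bit integers with a fixed number of fractional bits. This is a hand-rolled illustration; an actual DPU deployment relies on vendor quantization tooling, and the fractional-bit choice here is an assumption.

import numpy as np

def quantize_int8(w, frac_bits=6):
    """Symmetric fixed-point quantization: round to 1/2**frac_bits steps,
    then saturate to the signed 8-bit range [-128, 127]."""
    q = np.round(w * 2 ** frac_bits)
    return np.clip(q, -128, 127).astype(np.int8)

def dequantize_int8(q, frac_bits=6):
    return q.astype(np.float32) / 2 ** frac_bits

w = np.random.uniform(-1, 1, 8).astype(np.float32)
err = np.abs(w - dequantize_int8(quantize_int8(w))).max()
print(f"max rounding error: {err:.4f}")  # bounded by 2**-(frac_bits+1)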

20 pages, 6748 KB  
Article
YOLO-SSFA: A Lightweight Real-Time Infrared Detection Method for Small Targets
by Yuchi Wang, Minghua Cao, Qing Yang, Yue Zhang and Zexuan Wang
Information 2025, 16(7), 618; https://doi.org/10.3390/info16070618 - 20 Jul 2025
Cited by 1 | Viewed by 1341
Abstract
Infrared small target detection is crucial for military surveillance and autonomous driving. However, complex scenes and weak signal characteristics make the identification of such targets particularly difficult. This study proposes YOLO-SSFA, an enhanced You Only Look Once version 11 (YOLOv11) model with three modules: Scale-Sequence Feature Fusion (SSFF), LiteShiftHead detection head, and Noise Suppression Network (NSN). SSFF improves multi-scale feature representation through adaptive fusion; LiteShiftHead boosts efficiency via sparse convolution and dynamic integration; and NSN enhances localization accuracy by focusing on key regions. Experiments on the HIT-UAV and FLIR datasets show mAP50 scores of 94.9% and 85%, respectively. These findings showcase YOLO-SSFA’s strong potential for real-time deployment in challenging infrared environments.
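
The adaptive-fusion idea behind SSFF can be sketched as resizing feature maps from several pyramid levels to a common resolution and combining them with learned per-scale weights. The PyTorch module below is illustrative only; the published SSFF design may differ.

import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveScaleFusion(nn.Module):
    """Illustrative multi-scale fusion: resize pyramid features to the
    first level's resolution, then blend with learned softmax weights."""
    def __init__(self, num_scales):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(num_scales))

    def forward(self, feats):
        # feats: list of (B, C, Hi, Wi) tensors with a shared channel count.
        target = feats[0].shape[-2:]
        resized = [F.interpolate(f, size=target, mode="bilinear",
                                 align_corners=False) for f in feats]
        w = torch.softmax(self.logits, dim=0)
        return sum(w[i] * f for i, f in enumerate(resized))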

20 pages, 3265 KB  
Article
Enhancing Rare Class Performance in HOI Detection with Re-Splitting and a Fair Test Dataset
by Gyubin Park and Afaque Manzoor Soomro
Information 2025, 16(6), 474; https://doi.org/10.3390/info16060474 - 6 Jun 2025
Viewed by 1105
Abstract
In Human–Object Interaction (HOI) detection, class imbalance severely limits model performance on infrequent interaction categories. To overcome this problem, a Re-Splitting algorithm has been developed. This algorithm applies DreamSim-based clustering and k-means-based partitioning to restructure the train–test splits. By doing so, the approach balances rare and frequent interaction classes, thereby increasing robustness. A Real-World test dataset has also been introduced, serving as a truly independent benchmark; it is designed to address the class distribution bias commonly present in traditional test sets. As shown in the Experiment and Evaluation subsection, a high level of performance can be achieved for the general case using different few-shot and rare-class training instances. Models trained solely on the re-split dataset show significant improvements in rare-class mAP, particularly for one-stage models. Evaluation on the Real-World test dataset further exposes previously overlooked aspects of model performance and supports fair dataset structuring. The methods are validated with extensive experiments using five one-stage and two two-stage models. Our analysis shows that reshaping dataset distributions increases rare-class detection by as much as 8.0 mAP. This study paves the way for balanced training and evaluation, leading to a general framework for scalable, fair, and generalizable HOI detection.
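
The re-splitting procedure can be approximated as: embed each image (e.g., with DreamSim features), cluster the embeddings with k-means, and draw the test set evenly across clusters so that rare interaction classes are represented. A hypothetical scikit-learn sketch; the cluster count and test fraction are illustrative, not the paper's settings.

import numpy as np
from sklearn.cluster import KMeans

def cluster_resplit(embeddings, test_frac=0.2, k=50, seed=0):
    """Restructure a train-test split by sampling test items from every
    embedding cluster, so rare visual groups appear in both splits."""
    clusters = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(embeddings)
    rng = np.random.default_rng(seed)
    test_idx = []
    for c in range(k):
        members = np.where(clusters == c)[0]
        n_test = max(1, int(len(members) * test_frac))
        test_idx.extend(rng.choice(members, size=n_test, replace=False))
    test_idx = np.array(sorted(test_idx))
    train_idx = np.setdiff1d(np.arange(len(embeddings)), test_idx)
    return train_idx, test_idx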

22 pages, 2695 KB  
Article
Comparing Classification Algorithms to Recognize Selected Gestures Based on Microsoft Azure Kinect Joint Data
by Marc Funken and Thomas Hanne
Information 2025, 16(5), 421; https://doi.org/10.3390/info16050421 - 21 May 2025
Cited by 1 | Viewed by 1048
Abstract
This study aims to explore the potential of exergaming (which can be used alongside prescribed medication for children with spinal muscular atrophy) and examine its effects on monitoring and diagnosis. The present study focuses on comparing models trained on joint data for gesture detection, which has not been extensively explored in previous studies. The study investigates three approaches to detecting gestures based on 3D Microsoft Azure Kinect joint data. We discuss simple decision rules based on angles and distances to label gestures. In addition, we explore supervised learning methods to increase the accuracy of gesture recognition in gamification. The compared models performed well on the recorded sample data, with recurrent neural networks outperforming feedforward neural networks and decision trees on the captured motions. The findings suggest that gesture recognition based on joint data can be a valuable tool for monitoring and diagnosing children with spinal muscular atrophy. This study contributes to the growing body of research on the potential of virtual solutions in rehabilitation. The results also highlight the importance of using joint data for gesture recognition and provide insights into the most effective models for this task. These findings can inform the development of more accurate and effective monitoring and diagnostic tools for children with spinal muscular atrophy.
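
The angle-based decision rules can be illustrated with a small NumPy helper that computes the angle at a joint from three 3D joint positions; the gesture rule and threshold below are hypothetical examples, not the study's actual rules.

import numpy as np

def joint_angle(a, b, c):
    """Angle at joint b (degrees) formed by 3D points a-b-c,
    e.g., shoulder-elbow-wrist positions from Azure Kinect data."""
    u = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    v = np.asarray(c, dtype=float) - np.asarray(b, dtype=float)
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-8)
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

# Hypothetical rule: treat the arm as extended when the elbow angle is
# nearly straight. The 160-degree threshold is illustrative.
def arm_extended(shoulder, elbow, wrist, min_angle=160.0):
    return joint_angle(shoulder, elbow, wrist) >= min_angle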

21 pages, 3195 KB  
Article
YOLO-LSM: A Lightweight UAV Target Detection Algorithm Based on Shallow and Multiscale Information Learning
by Chenxing Wu, Changlong Cai, Feng Xiao, Jiahao Wang, Yulin Guo and Longhui Ma
Information 2025, 16(5), 393; https://doi.org/10.3390/info16050393 - 9 May 2025
Cited by 3 | Viewed by 1443
Abstract
To address challenges such as large-scale variations, the high density of small targets, and the large parameter counts of deep learning-based target detection models, which limit their deployment on UAV platforms with fixed performance and limited computational resources, a lightweight UAV target detection algorithm, YOLO-LSM, is proposed. First, to mitigate the loss of small target information, an Efficient Small Target Detection Layer (ESTDL) is developed, alongside structural improvements to the baseline model to reduce parameters. Second, a Multiscale Lightweight Convolution (MLConv) is designed, and a lightweight feature extraction module, MLCSP, is constructed to enhance the extraction of detailed information. Focaler inner IoU is incorporated to improve bounding box matching and localization, thereby accelerating model convergence. Finally, a novel feature fusion network, DFSPP, is proposed to enhance accuracy by optimizing the selection and adjustment of target scale ranges. Validation on the VisDrone2019 and Tiny Person datasets demonstrates that, compared to the benchmark network, YOLO-LSM achieves mAP@0.5 improvements of 6.9 and 3.5 percentage points, respectively, with a parameter count of 1.9 M, a reduction of approximately 72%. Unlike previous work on medical detection, this study tailors YOLO-LSM for UAV-based small object detection by introducing targeted improvements in feature extraction, detection heads, and loss functions, achieving better adaptation to aerial scenarios.
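
Focaler-style IoU reweighting can be sketched as a linear remapping of plain IoU values over an interval [d, u], so the regression loss concentrates on a chosen difficulty band. The PyTorch sketch below is hedged: the paper's exact Focaler inner-IoU formulation may differ, and the interval bounds are illustrative.

import torch

def focaler_iou(iou, d=0.0, u=0.95):
    """Linearly remap IoU over [d, u] and clamp to [0, 1]; values below d
    are zeroed and values above u saturate, focusing the loss in between."""
    return torch.clamp((iou - d) / (u - d), min=0.0, max=1.0)

# Usage in a box-regression loss, e.g.: loss = 1.0 - focaler_iou(plain_iou)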
