Search Results (326)

Search Parameters:
Keywords = handcrafted approaches

23 pages, 3055 KiB  
Article
RDPNet: A Multi-Scale Residual Dilated Pyramid Network with Entropy-Based Feature Fusion for Epileptic EEG Classification
by Tongle Xie, Wei Zhao, Yanyouyou Liu and Shixiao Xiao
Entropy 2025, 27(8), 830; https://doi.org/10.3390/e27080830 - 5 Aug 2025
Abstract
Epilepsy is a prevalent neurological disorder affecting approximately 50 million individuals worldwide. Electroencephalogram (EEG) signals play a vital role in the diagnosis and analysis of epileptic seizures. However, traditional machine learning techniques often rely on handcrafted features, limiting their robustness and generalizability across diverse EEG acquisition settings, seizure types, and patients. To address these limitations, we propose RDPNet, a multi-scale residual dilated pyramid network with entropy-guided feature fusion for automated epileptic EEG classification. RDPNet combines residual convolution modules to extract local features and a dilated convolutional pyramid to capture long-range temporal dependencies. A dual-pathway fusion strategy integrates pooled and entropy-based features from both shallow and deep branches, enabling robust representation of spatial saliency and statistical complexity. We evaluate RDPNet on two benchmark datasets: the University of Bonn and TUSZ. On the Bonn dataset, RDPNet achieves 99.56–100% accuracy in binary classification, 99.29–99.79% in ternary tasks, and 95.10% in five-class classification. On the clinically realistic TUSZ dataset, it reaches a weighted F1-score of 95.72% across seven seizure types. Compared with several baselines, RDPNet consistently outperforms existing approaches, demonstrating superior robustness, generalizability, and clinical potential for epileptic EEG analysis. Full article
(This article belongs to the Special Issue Complexity, Entropy and the Physics of Information II)
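The entropy-guided fusion idea named in the RDPNet abstract can be sketched in plain Python. This is an illustrative reading, not the paper's actual implementation, and all function names are ours: each branch's pooled feature is weighted by the normalized Shannon entropy of that branch, so statistically richer branches contribute more to the fused vector.

```python
import math

def shannon_entropy(values, bins=16):
    """Histogram-based Shannon entropy (bits) of a 1-D signal segment."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # constant signal carries no information
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts if c)

def entropy_weighted_fusion(branches):
    """Weight each branch's mean-pooled feature by its normalized entropy,
    so branches with higher statistical complexity dominate the fusion."""
    pooled = [sum(b) / len(b) for b in branches]
    ents = [shannon_entropy(b) for b in branches]
    total = sum(ents) or 1.0
    return [p * (e / total) for p, e in zip(pooled, ents)]
```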

36 pages, 25361 KiB  
Article
Remote Sensing Image Compression via Wavelet-Guided Local Structure Decoupling and Channel–Spatial State Modeling
by Jiahui Liu, Lili Zhang and Xianjun Wang
Remote Sens. 2025, 17(14), 2419; https://doi.org/10.3390/rs17142419 - 12 Jul 2025
Viewed by 467
Abstract
As the resolution and data volume of remote sensing imagery continue to grow, achieving efficient compression without sacrificing reconstruction quality remains a major challenge: traditional handcrafted codecs often fail to balance rate-distortion performance and computational complexity, while deep learning-based approaches offer superior representational capacity but still struggle to balance fine-detail adaptation against computational efficiency. Mamba, a state-space model (SSM)-based architecture, offers linear-time complexity and excels at capturing long-range dependencies in sequences, and it has been adopted in remote sensing compression tasks to model long-distance dependencies between pixels. However, despite its effectiveness in global context aggregation, Mamba's uniform bidirectional scanning is insufficient for capturing high-frequency structures such as edges and textures. Moreover, existing visual state-space (VSS) models built upon Mamba typically treat all channels equally and lack mechanisms to dynamically focus on semantically salient spatial regions. To address these issues, we present an architecture for remote sensing image compression called the Multi-scale Channel Global Mamba Network (MGMNet). MGMNet integrates a spatial-channel dynamic weighting mechanism into the Mamba architecture, enhancing global semantic modeling while selectively emphasizing informative features. It comprises two key modules. The Wavelet Transform-guided Local Structure Decoupling (WTLS) module applies multi-scale wavelet decomposition to disentangle and separately encode low- and high-frequency components, enabling efficient parallel modeling of global contours and local textures. The Channel–Global Information Modeling (CGIM) module enhances conventional VSS by introducing a dual-path attention strategy that reweights spatial and channel information, improving the modeling of long-range dependencies and edge structures. Extensive evaluations on three distinct remote sensing datasets show that MGMNet outperforms current state-of-the-art models across various performance metrics. Full article

22 pages, 6123 KiB  
Article
Real-Time Proprioceptive Sensing Enhanced Switching Model Predictive Control for Quadruped Robot Under Uncertain Environment
by Sanket Lokhande, Yajie Bao, Peng Cheng, Dan Shen, Genshe Chen and Hao Xu
Electronics 2025, 14(13), 2681; https://doi.org/10.3390/electronics14132681 - 2 Jul 2025
Viewed by 503
Abstract
Quadruped robots have shown significant potential in disaster relief applications, where they have to navigate complex terrains for search and rescue or reconnaissance operations. However, their deployment is hindered by limited adaptability in highly uncertain environments, especially when relying solely on vision-based sensors like cameras or LiDAR, which are susceptible to occlusions, poor lighting, and environmental interference. To address these limitations, this paper proposes a novel sensor-enhanced hierarchical switching model predictive control (MPC) framework that integrates proprioceptive sensing with a bi-level hybrid dynamic model. Unlike existing methods that either rely on handcrafted controllers or deep learning-based control pipelines, our approach introduces three core innovations: (1) a situation-aware, bi-level hybrid dynamic modeling strategy that hierarchically combines single-body rigid dynamics with distributed multi-body dynamics for modeling agility and scalability; (2) a three-layer hybrid control framework, including a terrain-aware switching MPC layer, a distributed torque controller, and a fast PD control loop for enhanced robustness during contact transitions; and (3) a multi-IMU-based proprioceptive feedback mechanism for terrain classification and adaptive gait control under sensor-occluded or GPS-denied environments. Together, these components form a unified and computationally efficient control scheme that addresses practical challenges such as limited onboard processing, unstructured terrain, and environmental uncertainty. A series of experimental results demonstrate that the proposed method outperforms existing vision- and learning-based controllers in terms of stability, adaptability, and control efficiency during high-speed locomotion over irregular terrain. Full article
(This article belongs to the Special Issue Smart Robotics and Autonomous Systems)

73 pages, 2833 KiB  
Article
A Comprehensive Methodological Survey of Human Activity Recognition Across Diverse Data Modalities
by Jungpil Shin, Najmul Hassan, Abu Saleh Musa Miah and Satoshi Nishimura
Sensors 2025, 25(13), 4028; https://doi.org/10.3390/s25134028 - 27 Jun 2025
Cited by 1 | Viewed by 1454
Abstract
Human Activity Recognition (HAR) systems aim to understand human behavior and assign a label to each action, attracting significant attention in computer vision due to their wide range of applications. HAR can leverage various data modalities, such as RGB images and video, skeleton, depth, infrared, point cloud, event stream, audio, acceleration, and radar signals. Each modality provides unique and complementary information suited to different application scenarios. Consequently, numerous studies have investigated diverse approaches for HAR using these modalities. This survey includes only peer-reviewed research papers published in English to ensure linguistic consistency and academic integrity. This paper presents a comprehensive survey of the latest advancements in HAR from 2014 to 2025, focusing on Machine Learning (ML) and Deep Learning (DL) approaches categorized by input data modalities. We review both single-modality and multi-modality techniques, highlighting fusion-based and co-learning frameworks. Additionally, we cover advancements in hand-crafted action features, methods for recognizing human–object interactions, and activity detection. Our survey includes a detailed dataset description for each modality, as well as a summary of the latest HAR systems, accompanied by a mathematical derivation for evaluating the deep learning model for each modality, and it also provides comparative results on benchmark datasets. Finally, we provide insightful observations and propose effective future research directions in HAR. Full article
(This article belongs to the Special Issue Computer Vision and Sensors-Based Application for Intelligent Systems)

30 pages, 2018 KiB  
Article
Comprehensive Performance Comparison of Signal Processing Features in Machine Learning Classification of Alcohol Intoxication on Small Gait Datasets
by Muxi Qi, Samuel Chibuoyim Uche and Emmanuel Agu
Appl. Sci. 2025, 15(13), 7250; https://doi.org/10.3390/app15137250 - 27 Jun 2025
Viewed by 383
Abstract
Detecting alcohol intoxication is crucial for preventing accidents and enhancing public safety. Traditional intoxication detection methods rely on direct blood alcohol concentration (BAC) measurement via breathalyzers and wearable sensors. These methods require the user to purchase and carry external hardware such as breathalyzers, which is expensive and cumbersome. Convenient, unobtrusive intoxication detection methods using equipment already owned by users are therefore desirable. Recent research has explored machine learning-based approaches using smartphone accelerometers to classify intoxicated gait patterns. While neural network approaches have emerged, gait datasets are often too small to support them because collecting intoxicated gait data is difficult; to avoid overfitting on such small datasets, traditional machine learning (ML) classification is preferred. A comprehensive set of ML features has been proposed, but until now no work has systematically evaluated the performance of the various categories of gait features for the alcohol intoxication detection task using traditional machine learning algorithms. This study evaluates 27 signal processing features handcrafted from accelerometer gait data across five domains: time, frequency, wavelet, statistical, and information-theoretic. The data were collected from 24 subjects who experienced simulated alcohol intoxication using impairment goggles ("goggle busters"). Correlation-based feature selection (CFS) was employed to rank the features most correlated with alcohol-induced gait changes, revealing that 22 features exhibited statistically significant correlations with BAC levels. These statistically significant features were used to train supervised classifiers and assess their impact on alcohol intoxication detection accuracy. Statistical features yielded the highest accuracy (83.89%), followed by time-domain (83.22%) and frequency-domain features (82.21%). Classifying all 22 significant features from every domain with a random forest model improved accuracy to 84.9%. These findings suggest that incorporating a broader set of signal processing features enhances the accuracy of smartphone-based alcohol intoxication detection. Full article
(This article belongs to the Special Issue AI-Based Biomedical Signal and Image Processing)
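Handcrafted signal-processing features of the kind this study evaluates are straightforward to compute. A minimal stdlib sketch of a few time-domain and statistical features for one window of accelerometer magnitudes follows; the feature names and choices are ours for illustration, not the study's exact 27-feature set:

```python
import math
import statistics

def gait_window_features(window):
    """Illustrative time-domain / statistical features for one window
    of accelerometer magnitude samples."""
    n = len(window)
    mean = statistics.fmean(window)
    std = statistics.pstdev(window)                      # population std dev
    rms = math.sqrt(sum(v * v for v in window) / n)      # signal energy proxy
    # Zero-crossing rate of the mean-removed signal: a step-rhythm proxy.
    centered = [v - mean for v in window]
    zcr = sum(1 for a, b in zip(centered, centered[1:]) if a * b < 0) / (n - 1)
    return {"mean": mean, "std": std, "rms": rms, "zcr": zcr}
```

Feature dictionaries like this, computed per window, would then feed a traditional classifier such as a random forest.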

28 pages, 11832 KiB  
Article
On the Minimum Dataset Requirements for Fine-Tuning an Object Detector for Arable Crop Plant Counting: A Case Study on Maize Seedlings
by Samuele Bumbaca and Enrico Borgogno-Mondino
Remote Sens. 2025, 17(13), 2190; https://doi.org/10.3390/rs17132190 - 25 Jun 2025
Viewed by 406
Abstract
Object detection is essential for precision agriculture applications like automated plant counting, but the minimum dataset requirements for effective model deployment remain poorly understood for arable crop seedling detection on orthomosaics. This study investigated how much annotated data is required to achieve standard counting accuracy (R2 = 0.85) for maize seedlings across different object detection approaches. We systematically evaluated traditional deep learning models requiring many training examples (YOLOv5, YOLOv8, YOLO11, RT-DETR), newer approaches requiring few examples (CD-ViTO), and methods requiring zero labeled examples (OWLv2) using drone-captured orthomosaic RGB imagery. We also implemented a handcrafted computer graphics algorithm as baseline. Models were tested with varying training sources (in-domain vs. out-of-distribution data), training dataset sizes (10–150 images), and annotation quality levels (10–100%). Our results demonstrate that no model trained on out-of-distribution data achieved acceptable performance, regardless of dataset size. In contrast, models trained on in-domain data reached the benchmark with as few as 60–130 annotated images, depending on architecture. Transformer-based models (RT-DETR) required significantly fewer samples (60) than CNN-based models (110–130), though they showed different tolerances to annotation quality reduction. Models maintained acceptable performance with only 65–90% of original annotation quality. Despite recent advances, neither few-shot nor zero-shot approaches met minimum performance requirements for precision agriculture deployment. These findings provide practical guidance for developing maize seedling detection systems, demonstrating that successful deployment requires in-domain training data, with minimum dataset requirements varying by model architecture. Full article

27 pages, 2049 KiB  
Article
Optimizing Tumor Detection in Brain MRI with One-Class SVM and Convolutional Neural Network-Based Feature Extraction
by Azeddine Mjahad and Alfredo Rosado-Muñoz
J. Imaging 2025, 11(7), 207; https://doi.org/10.3390/jimaging11070207 - 21 Jun 2025
Viewed by 472
Abstract
The early detection of brain tumors is critical for improving clinical outcomes and patient survival. However, medical imaging datasets frequently exhibit class imbalance, posing significant challenges for traditional classification algorithms that rely on balanced data distributions. To address this issue, this study employs a One-Class Support Vector Machine (OCSVM) trained exclusively on features extracted from healthy brain MRI images, using both deep learning architectures—such as DenseNet121, VGG16, MobileNetV2, InceptionV3, and ResNet50—and classical feature extraction techniques. Experimental results demonstrate that combining Convolutional Neural Network (CNN)-based feature extraction with OCSVM significantly improves anomaly detection performance compared with simpler handcrafted approaches. DenseNet121 achieved an accuracy of 94.83%, a precision of 99.23%, and a sensitivity of 89.97%, while VGG16 reached an accuracy of 95.33%, a precision of 98.87%, and a sensitivity of 91.32%. MobileNetV2 showed a competitive trade-off between accuracy (92.83%) and computational efficiency, making it suitable for resource-constrained environments. Additionally, the pure CNN model—trained directly for classification without OCSVM—outperformed hybrid methods with an accuracy of 97.83%, highlighting the effectiveness of deep convolutional networks in directly learning discriminative features from MRI data. This approach enables reliable detection of brain tumor anomalies without requiring labeled pathological data, offering a promising solution for clinical contexts where abnormal samples are scarce. Future research will focus on reducing inference time, expanding and diversifying training datasets, and incorporating explainability tools to support clinical integration and trust in AI-based diagnostics. Full article
(This article belongs to the Section Medical Imaging)

22 pages, 8644 KiB  
Article
Privacy-Preserving Approach for Early Detection of Long-Lie Incidents: A Pilot Study with Healthy Subjects
by Riska Analia, Anne Forster, Sheng-Quan Xie and Zhiqiang Zhang
Sensors 2025, 25(12), 3836; https://doi.org/10.3390/s25123836 - 19 Jun 2025
Viewed by 650
Abstract
(1) Background: Detecting long-lie incidents—where individuals remain immobile after a fall—is essential for timely intervention and preventing severe health consequences. However, most existing systems focus only on fall detection, neglect post-fall monitoring, and raise privacy concerns, especially in real-time, non-invasive applications; (2) Methods: This study proposes a lightweight, privacy-preserving, long-lie detection system utilizing thermal imaging and a soft-voting ensemble classifier. A low-resolution thermal camera captured simulated falls and activities of daily living (ADL) performed by ten healthy participants. Human pose keypoints were extracted using MediaPipe, followed by the computation of five handcrafted postural features. The top three classifiers—automatically selected based on cross-validation performance—formed the soft-voting ensemble. Long-lie conditions were identified through post-fall immobility monitoring over a defined period, using rule-based logic on posture stability and duration; (3) Results: The ensemble model achieved high classification performance with accuracy, precision, recall, and an F1 score of 0.98. Real-time deployment on a Raspberry Pi 5 demonstrated the system is capable of accurately detecting long-lie incidents based on continuous monitoring over 15 min, with minimal posture variation; (4) Conclusion: The proposed system introduces a novel approach to long-lie detection by integrating privacy-aware sensing, interpretable posture-based features, and efficient edge computing. It demonstrates strong potential for deployment in homecare settings. Future work includes validation with older adults and integration of vital sign monitoring for comprehensive assessment. Full article

28 pages, 4483 KiB  
Article
Historical Manuscripts Analysis: A Deep Learning System for Writer Identification Using Intelligent Feature Selection with Vision Transformers
by Merouane Boudraa, Akram Bennour, Mouaaz Nahas, Rashiq Rafiq Marie and Mohammed Al-Sarem
J. Imaging 2025, 11(6), 204; https://doi.org/10.3390/jimaging11060204 - 19 Jun 2025
Viewed by 706
Abstract
Identifying the scriptwriter in historical manuscripts is crucial for historians, providing valuable insights into historical contexts and aiding in solving historical mysteries. This research presents a robust deep learning system designed for classifying historical manuscripts by writer, employing intelligent feature selection and vision transformers. Our methodology meticulously investigates the efficacy of both handcrafted techniques for feature identification and deep learning architectures for classification tasks in writer identification. The initial preprocessing phase involves thorough document refinement using bilateral filtering for denoising and Otsu thresholding for binarization, ensuring document clarity and consistency for subsequent feature detection. We utilize the FAST detector for feature detection, extracting keypoints representing handwriting styles, followed by clustering with the k-means algorithm to obtain meaningful patches of uniform size. This strategic clustering minimizes redundancy and creates a comprehensive dataset ideal for deep learning classification tasks. Leveraging vision transformer models, our methodology effectively learns complex patterns and features from extracted patches, enabling precise identification of writers across historical manuscripts. This study pioneers the application of vision transformers in historical document analysis, showcasing superior performance on the “ICDAR 2017” dataset compared to state-of-the-art methods and affirming our approach as a robust tool for historical manuscript analysis. Full article
(This article belongs to the Section Document Analysis and Processing)

23 pages, 4949 KiB  
Article
Hybrid LDA-CNN Framework for Robust End-to-End Myoelectric Hand Gesture Recognition Under Dynamic Conditions
by Hongquan Le, Marc in het Panhuis, Geoffrey M. Spinks and Gursel Alici
Robotics 2025, 14(6), 83; https://doi.org/10.3390/robotics14060083 - 17 Jun 2025
Viewed by 871
Abstract
Gesture recognition based on conventional machine learning is the main control approach for advanced prosthetic hand systems. Its primary limitation is the need for feature extraction, which must meet real-time control requirements. On the other hand, deep learning models can overfit when trained on small datasets. For these reasons, we propose a hybrid Linear Discriminant Analysis–convolutional neural network (LDA-CNN) framework to improve the gesture recognition performance of sEMG-based prosthetic hand control systems. Within this framework, 1D-CNN filters are trained to generate a latent representation that closely approximates Fisher's (LDA's) discriminant subspace constructed from handcrafted features. Under the train-one-test-all evaluation scheme, our proposed hybrid framework consistently outperformed the 1D-CNN trained with cross-entropy loss only, showing improvements from 4% to 11% across two public datasets featuring hand gestures recorded under various limb positions and arm muscle contraction levels. Furthermore, our framework exhibited advantages in terms of induced spectral regularization, which led to a state-of-the-art recognition error of 22.79% with the extended 23-feature set when tested on the multi-limb-position dataset. The main novelty of our hybrid framework is that it decouples feature extraction from inference time, enabling the future incorporation of a more extensive set of features while keeping inference computation minimal. Full article
(This article belongs to the Special Issue AI for Robotic Exoskeletons and Prostheses)

22 pages, 1961 KiB  
Article
Incorporating Implicit and Explicit Feature Fusion into Hybrid Recommendation for Improved Rating Prediction
by Qinglong Li, Euiju Jeong, Seok-Kee Lee and Jiaen Li
Electronics 2025, 14(12), 2384; https://doi.org/10.3390/electronics14122384 - 11 Jun 2025
Viewed by 395
Abstract
Online review texts serve as a valuable source of auxiliary information for addressing the data sparsity problem in recommender systems. These reviews often reflect user preferences across multiple item attributes and can be effectively incorporated into recommendation models to enhance both the accuracy and interpretability of recommendations. Review-based recommendation approaches can be broadly classified into implicit and explicit methods. Implicit methods leverage deep learning techniques to extract latent semantic representations from review texts but generally lack interpretability due to limited transparency in the training process. In contrast, explicit methods rely on hand-crafted features derived from domain knowledge, which offer high explanatory capability but typically capture only shallow information. Integrating the complementary strengths of these two approaches presents a promising direction for improving recommendation performance. However, previous research exploring this integration remains limited. In this study, we propose a novel recommendation model that jointly considers implicit and explicit representations derived from review texts. To this end, we incorporate a self-attention mechanism to emphasize important features from each representation type and utilize Bidirectional Encoder Representations from Transformers (BERT) to capture rich contextual information embedded in the reviews. We evaluate the performance of the proposed model through extensive experiments using three real-world datasets. The experimental results demonstrate that our model outperforms several baseline models, confirming its effectiveness in generating accurate and explainable recommendations. Full article
(This article belongs to the Special Issue AI and Machine Learning in Recommender Systems and Customer Behavior)

19 pages, 840 KiB  
Article
A Dual-Feature Framework for Enhanced Diagnosis of Myeloproliferative Neoplasm Subtypes Using Artificial Intelligence
by Amna Bamaqa, N. S. Labeeb, Eman M. El-Gendy, Hani M. Ibrahim, Mohamed Farsi, Hossam Magdy Balaha, Mahmoud Badawy and Mostafa A. Elhosseini
Bioengineering 2025, 12(6), 623; https://doi.org/10.3390/bioengineering12060623 - 7 Jun 2025
Viewed by 688
Abstract
Myeloproliferative neoplasms, particularly the Philadelphia chromosome-negative (Ph-negative) subtypes such as essential thrombocythemia, polycythemia vera, and primary myelofibrosis, present diagnostic challenges due to overlapping morphological features and clinical heterogeneity. Traditional diagnostic approaches, including imaging and histopathological analysis, are often limited by interobserver variability, delayed diagnosis, and subjective interpretations. To address these limitations, we propose a novel framework that integrates handcrafted and automatic feature extraction techniques for improved classification of Ph-negative myeloproliferative neoplasms. Handcrafted features capture interpretable morphological and textural characteristics. In contrast, automatic features utilize deep learning models to identify complex patterns in histopathological images. The extracted features were used to train machine learning models, with hyperparameter optimization performed using Optuna. Our framework achieved high performance across multiple metrics, including precision, recall, F1 score, accuracy, specificity, and weighted average. The concatenated probabilities, which combine both feature types, demonstrated the highest mean weighted average of 0.9969, surpassing the individual performances of handcrafted (0.9765) and embedded features (0.9686). Statistical analysis confirmed the robustness and reliability of the results. However, challenges remain in assuming normal distributions for certain feature types. This study highlights the potential of combining domain-specific knowledge with data-driven approaches to enhance diagnostic accuracy and support clinical decision-making. Full article

24 pages, 822 KiB  
Article
Survey on Image-Based Vehicle Detection Methods
by Mortda A. A. Adam and Jules R. Tapamo
World Electr. Veh. J. 2025, 16(6), 303; https://doi.org/10.3390/wevj16060303 - 29 May 2025
Viewed by 838
Abstract
Vehicle detection is essential for real-world applications such as road surveillance, intelligent transportation systems, and autonomous driving, where high accuracy and real-time performance are critical. However, achieving robust detection remains challenging due to scene complexity, occlusion, scale variation, and varying lighting conditions. Over the past two decades, numerous studies have been proposed to address these issues. This study presents a comprehensive and structured survey of image-based vehicle detection methods, systematically comparing classical machine learning techniques based on handcrafted features with modern deep learning approaches. Deep learning methods are categorized into one-stage detectors (e.g., YOLO, SSD, FCOS, CenterNet), two-stage detectors (e.g., Faster R-CNN, Mask R-CNN), transformer-based detectors (e.g., DETR, Swin Transformer), and GAN-based methods, highlighting architectural trade-offs concerning speed, accuracy, and practical deployment. We analyze widely adopted performance metrics from recent studies, evaluate characteristics and limitations of popular vehicle detection datasets, and explicitly discuss technical challenges, including domain generalization, environmental variability, computational constraints, and annotation quality. The survey concludes by clearly identifying open research challenges and promising future directions, such as efficient edge deployment strategies, multimodal data fusion, transformer-based enhancements, and integration with Vehicle-to-Everything (V2X) communication systems. Full article
(This article belongs to the Special Issue Vehicle Safe Motion in Mixed Vehicle Technologies Environment)
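Both the one-stage and two-stage detector families surveyed above rely on intersection-over-union (IoU) scoring and non-maximum suppression (NMS) to prune overlapping candidate boxes. A minimal sketch of these two building blocks, assuming axis-aligned boxes in (x1, y1, x2, y2) format and an illustrative IoU threshold of 0.5:

```python
def iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, thresh=0.5):
    """Greedy NMS: keep the highest-scoring box, drop boxes overlapping it."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        i = order.pop(0)
        keep.append(i)
        order = [j for j in order if iou(boxes[i], boxes[j]) < thresh]
    return keep
```

Production detectors use vectorized variants of the same logic (e.g., torchvision's `nms` op), but the greedy algorithm is identical.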
10 pages, 451 KiB  
Article
PF2N: Periodicity–Frequency Fusion Network for Multi-Instrument Music Transcription
by Taehyeon Kim, Man-Je Kim and Chang Wook Ahn
Mathematics 2025, 13(11), 1708; https://doi.org/10.3390/math13111708 - 23 May 2025
Viewed by 555
Abstract
Automatic music transcription in multi-instrument settings remains a highly challenging task due to overlapping harmonics and diverse timbres. To address this, we propose the Periodicity–Frequency Fusion Network (PF2N), a lightweight and modular component that enhances transcription performance by integrating both spectral and periodicity-domain [...] Read more.
Automatic music transcription in multi-instrument settings remains a highly challenging task due to overlapping harmonics and diverse timbres. To address this, we propose the Periodicity–Frequency Fusion Network (PF2N), a lightweight and modular component that enhances transcription performance by integrating both spectral and periodicity-domain representations. Inspired by traditional combined frequency and periodicity (CFP) methods, the PF2N reformulates CFP as a neural module that jointly learns harmonically correlated features across the frequency and cepstral domains. Unlike handcrafted alignments in classical approaches, the PF2N performs data-driven fusion using a learnable joint feature extractor. Extensive experiments on three benchmark datasets (Slakh2100, MusicNet, and MAESTRO) demonstrate that the PF2N consistently improves transcription accuracy when incorporated into state-of-the-art models. The results confirm the effectiveness and adaptability of the PF2N, highlighting its potential as a general-purpose enhancement for multi-instrument AMT systems. Full article
(This article belongs to the Section E1: Mathematics and Computer Science)
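Classical CFP, which the PF2N abstract says it reformulates as a learnable neural module, combines a spectral cue (harmonic peaks in the magnitude spectrum) with a periodicity cue (a peak in the cepstrum at the pitch period). A stdlib-only sketch of the cepstral periodicity cue, with a naive O(n²) DFT; the signal parameters and quefrency search range below are illustrative assumptions, not values from the paper:

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform (O(n^2), for illustration only)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def cfp_pitch(signal, fs, qmin, qmax):
    """Estimate f0 from the cepstral peak in the quefrency range [qmin, qmax)."""
    n = len(signal)
    spec = [abs(c) for c in dft(signal)]
    log_spec = [math.log(s + 1e-12) for s in spec]
    # Real cepstrum: transform of the log-magnitude spectrum. A peak at
    # quefrency q means the spectrum has harmonic ridges spaced n/q bins apart.
    ceps = [abs(c) / n for c in dft(log_spec)]
    q = max(range(qmin, qmax), key=lambda i: ceps[i])
    return fs / q  # period in samples -> fundamental frequency in Hz
```

On a synthetic tone with harmonics at 250, 500, and 750 Hz sampled at 8 kHz, the cepstral peak lands at the 32-sample period, recovering f0 = 250 Hz. The PF2N replaces this fixed pairing of the two domains with a learned joint feature extractor.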
22 pages, 11757 KiB  
Article
Comparative Study of Cell Nuclei Segmentation Based on Computational and Handcrafted Features Using Machine Learning Algorithms
by Rashadul Islam Sumon, Md Ariful Islam Mozumdar, Salma Akter, Shah Muhammad Imtiyaj Uddin, Mohammad Hassan Ali Al-Onaizan, Reem Ibrahim Alkanhel and Mohammed Saleh Ali Muthanna
Diagnostics 2025, 15(10), 1271; https://doi.org/10.3390/diagnostics15101271 - 16 May 2025
Cited by 1 | Viewed by 797
Abstract
Background: Nuclei segmentation is the first stage of automated microscopic image analysis. The cell nucleus is a crucial aspect in segmenting to gain more insight into cell characteristics and functions that enable computer-aided pathology for early disease detection, such as prostate cancer, breast [...] Read more.
Background: Nuclei segmentation is the first stage of automated microscopic image analysis. Segmenting the cell nucleus is crucial for gaining insight into cell characteristics and functions, enabling computer-aided pathology for early detection of diseases such as prostate cancer, breast cancer, and brain tumors. Nucleus segmentation remains a challenging task despite significant advancements in automated methods. Traditional techniques, such as Otsu thresholding and watershed approaches, are ineffective in challenging scenarios, whereas deep learning-based methods achieve remarkable results across various biological imaging modalities, including computational pathology. Methods: This work explores machine learning approaches for nuclei segmentation by evaluating the quality of the resulting segmentations. We employed several methods: K-means clustering, Random Forest (RF), and Support Vector Machine (SVM) with handcrafted features, and Logistic Regression (LR) using features derived from Convolutional Neural Networks (CNNs). Handcrafted features capture attributes such as the shape, texture, and intensity of nuclei and are meticulously designed using specialized domain knowledge; CNN-based features, in contrast, are automatically learned representations that identify complex patterns in nuclei images. Results: Experimental results show that Logistic Regression on CNN-derived features outperforms the other techniques, achieving an accuracy of 96.90%, a Dice coefficient of 74.24, and a Jaccard coefficient of 55.61. In contrast, the Random Forest, Support Vector Machine, and K-means algorithms yielded lower segmentation performance metrics. Conclusions: Leveraging CNN-based features in conjunction with Logistic Regression significantly enhances the accuracy of cell nuclei segmentation in pathological images. This approach holds promise for refining computer-aided pathology workflows, potentially leading to more reliable and earlier disease diagnoses. Full article
(This article belongs to the Special Issue Diagnostic Imaging of Prostate Cancer)
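The Dice and Jaccard coefficients reported above compare a predicted segmentation mask against the ground truth. A minimal sketch over flat binary masks (the 0/1 list encoding is an assumption for illustration; real pipelines operate on 2D arrays):

```python
def dice(pred, truth):
    """Dice coefficient: 2|A∩B| / (|A| + |B|) for binary masks."""
    inter = sum(p and t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    return 2.0 * inter / total if total else 1.0

def jaccard(pred, truth):
    """Jaccard index (IoU): |A∩B| / |A∪B| for binary masks."""
    inter = sum(p and t for p, t in zip(pred, truth))
    union = sum(p or t for p, t in zip(pred, truth))
    return inter / union if union else 1.0
```

The two metrics are monotonically related (Jaccard = Dice / (2 − Dice)), which is why papers often report both alongside pixel accuracy.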