Search Results (481)

Search Parameters:
Keywords = hand-crafted feature learning

23 pages, 3055 KiB  
Article
RDPNet: A Multi-Scale Residual Dilated Pyramid Network with Entropy-Based Feature Fusion for Epileptic EEG Classification
by Tongle Xie, Wei Zhao, Yanyouyou Liu and Shixiao Xiao
Entropy 2025, 27(8), 830; https://doi.org/10.3390/e27080830 - 5 Aug 2025
Abstract
Epilepsy is a prevalent neurological disorder affecting approximately 50 million individuals worldwide. Electroencephalogram (EEG) signals play a vital role in the diagnosis and analysis of epileptic seizures. However, traditional machine learning techniques often rely on handcrafted features, limiting their robustness and generalizability across diverse EEG acquisition settings, seizure types, and patients. To address these limitations, we propose RDPNet, a multi-scale residual dilated pyramid network with entropy-guided feature fusion for automated epileptic EEG classification. RDPNet combines residual convolution modules to extract local features and a dilated convolutional pyramid to capture long-range temporal dependencies. A dual-pathway fusion strategy integrates pooled and entropy-based features from both shallow and deep branches, enabling robust representation of spatial saliency and statistical complexity. We evaluate RDPNet on two benchmark datasets: the University of Bonn and TUSZ. On the Bonn dataset, RDPNet achieves 99.56–100% accuracy in binary classification, 99.29–99.79% in ternary tasks, and 95.10% in five-class classification. On the clinically realistic TUSZ dataset, it reaches a weighted F1-score of 95.72% across seven seizure types. Compared with several baselines, RDPNet consistently outperforms existing approaches, demonstrating superior robustness, generalizability, and clinical potential for epileptic EEG analysis.
(This article belongs to the Special Issue Complexity, Entropy and the Physics of Information II)
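The entropy pathway in this abstract can be illustrated with a minimal sketch: per-channel Shannon entropy computed from a histogram of each feature map, concatenated with ordinary average pooling from both a shallow and a deep branch. All shapes, bin counts, and function names below are illustrative assumptions, not the authors' RDPNet code.

```python
# Hypothetical sketch of entropy-based feature pooling in the spirit of
# RDPNet's dual-pathway fusion; shapes and bin count are assumptions.
import torch

def entropy_pool(feat: torch.Tensor, bins: int = 16) -> torch.Tensor:
    """Shannon entropy per channel of a (batch, channels, time) feature map."""
    b, c, t = feat.shape
    out = torch.empty(b, c)
    for i in range(b):
        for j in range(c):
            hist = torch.histc(feat[i, j], bins=bins)
            p = hist / hist.sum().clamp_min(1e-12)
            out[i, j] = -(p * (p + 1e-12).log()).sum()
    return out

def fuse(shallow: torch.Tensor, deep: torch.Tensor) -> torch.Tensor:
    """Concatenate average-pooled and entropy-pooled descriptors of both branches."""
    parts = []
    for f in (shallow, deep):
        parts.append(f.mean(dim=-1))   # pooled pathway
        parts.append(entropy_pool(f))  # entropy pathway
    return torch.cat(parts, dim=1)

x_shallow = torch.randn(2, 32, 256)  # toy EEG feature maps
x_deep = torch.randn(2, 64, 64)
print(fuse(x_shallow, x_deep).shape)  # torch.Size([2, 192])
```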

14 pages, 2727 KiB  
Article
A Multimodal MRI-Based Model for Colorectal Liver Metastasis Prediction: Integrating Radiomics, Deep Learning, and Clinical Features with SHAP Interpretation
by Xin Yan, Furui Duan, Lu Chen, Runhong Wang, Kexin Li, Qiao Sun and Kuang Fu
Curr. Oncol. 2025, 32(8), 431; https://doi.org/10.3390/curroncol32080431 - 30 Jul 2025
Viewed by 142
Abstract
Purpose: Predicting colorectal cancer liver metastasis (CRLM) is essential for prognostic assessment. This study aims to develop and validate an interpretable multimodal machine learning framework based on multiparametric MRI for predicting CRLM, and to enhance the clinical interpretability of the model through SHapley Additive exPlanations (SHAP) analysis and deep learning visualization. Methods: This multicenter retrospective study included 463 patients with pathologically confirmed colorectal cancer from two institutions, divided into training (n = 256), internal testing (n = 111), and external validation (n = 96) sets. Radiomics features were extracted from manually segmented regions on axial T2-weighted imaging (T2WI) and diffusion-weighted imaging (DWI). Deep learning features were obtained from a pretrained ResNet101 network using the same MRI inputs. A least absolute shrinkage and selection operator (LASSO) logistic regression classifier was developed for clinical, radiomics, deep learning, and combined models. Model performance was evaluated by AUC, sensitivity, specificity, and F1-score. SHAP was used to assess feature contributions, and Grad-CAM was applied to visualize deep feature attention. Results: The combined model integrating features across the three modalities achieved the highest performance across all datasets, with AUCs of 0.889 (training), 0.838 (internal test), and 0.822 (external validation), outperforming single-modality models. Decision curve analysis (DCA) revealed enhanced clinical net benefit from the integrated model, while calibration curves confirmed its good predictive consistency. SHAP analysis revealed that radiomic features related to T2WI texture (e.g., LargeDependenceLowGrayLevelEmphasis) and clinical biomarkers (e.g., CA19-9) were among the most predictive for CRLM. Grad-CAM visualizations confirmed that the deep learning model focused on tumor regions consistent with radiological interpretation. Conclusions: This study presents a robust and interpretable multiparametric MRI-based model for noninvasively predicting liver metastasis in colorectal cancer patients. By integrating handcrafted radiomics and deep learning features, and enhancing transparency through SHAP and Grad-CAM, the model provides both high predictive performance and clinically meaningful explanations. These findings highlight its potential value as a decision-support tool for individualized risk assessment and treatment planning in the management of colorectal cancer.
(This article belongs to the Section Gastrointestinal Oncology)
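The combined model described here amounts to L1-penalized (LASSO) logistic regression over concatenated clinical, radiomics, and deep features. The sketch below shows that recipe with scikit-learn; the feature arrays are synthetic stand-ins, and the dimensions and regularization strength are assumptions rather than the study's settings.

```python
# Minimal sketch of the combined-model idea: LASSO (L1) logistic regression
# over concatenated clinical, radiomics, and deep features (synthetic data).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 256
clinical = rng.normal(size=(n, 5))     # e.g., CA19-9 and other markers
radiomics = rng.normal(size=(n, 100))  # T2WI/DWI texture features
deep = rng.normal(size=(n, 128))       # ResNet101 embeddings
X = np.hstack([clinical, radiomics, deep])
y = rng.integers(0, 2, size=n)         # CRLM yes/no

model = make_pipeline(
    StandardScaler(),
    LogisticRegression(penalty="l1", solver="liblinear", C=0.1),  # LASSO-style sparsity
)
model.fit(X, y)
print("train AUC:", roc_auc_score(y, model.predict_proba(X)[:, 1]))
```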

21 pages, 3448 KiB  
Article
A Welding Defect Detection Model Based on Hybrid-Enhanced Multi-Granularity Spatiotemporal Representation Learning
by Chenbo Shi, Shaojia Yan, Lei Wang, Changsheng Zhu, Yue Yu, Xiangteng Zang, Aiping Liu, Chun Zhang and Xiaobing Feng
Sensors 2025, 25(15), 4656; https://doi.org/10.3390/s25154656 - 27 Jul 2025
Viewed by 375
Abstract
Real-time quality monitoring using molten pool images is a critical focus in researching high-quality, intelligent automated welding. To address interference problems in molten pool images under complex welding scenarios (e.g., reflected laser spots from spatter misclassified as porosity defects) and the limited interpretability of deep learning models, this paper proposes a multi-granularity spatiotemporal representation learning algorithm based on the hybrid enhancement of handcrafted and deep learning features. A MobileNetV2 backbone network integrated with a Temporal Shift Module (TSM) is designed to progressively capture the short-term dynamic features of the molten pool and integrate temporal information across both low-level and high-level features. A multi-granularity attention-based feature aggregation module is developed to select key interference-free frames using cross-frame attention, generate multi-granularity features via grouped pooling, and apply the Convolutional Block Attention Module (CBAM) at each granularity level. Finally, these multi-granularity spatiotemporal features are adaptively fused. Meanwhile, an independent branch utilizes the Histogram of Oriented Gradient (HOG) and Scale-Invariant Feature Transform (SIFT) features to extract long-term spatial structural information from historical edge images, enhancing the model’s interpretability. The proposed method achieves an accuracy of 99.187% on a self-constructed dataset. Additionally, it attains a real-time inference time of 20.983 ms per sample on a hardware platform equipped with an Intel i9-12900H CPU and an RTX 3060 GPU, thus effectively balancing accuracy, speed, and interpretability.
(This article belongs to the Topic Applied Computing and Machine Intelligence (ACMI))
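A Temporal Shift Module, as referenced in this abstract, exchanges a fraction of channels between neighbouring frames at zero FLOP cost before the 2D convolutions run. Below is a minimal generic TSM sketch, not the paper's implementation; the fold fraction and tensor layout are the usual conventions and are assumptions here.

```python
# Generic Temporal Shift Module (TSM) sketch: part of the channels is shifted
# forward/backward in time so 2D convs can see short-term dynamics.
import torch

def temporal_shift(x: torch.Tensor, n_segments: int, fold_div: int = 8) -> torch.Tensor:
    """x: (batch*time, channels, H, W) -> same shape, channels exchanged in time."""
    nt, c, h, w = x.shape
    n = nt // n_segments
    x = x.view(n, n_segments, c, h, w)
    fold = c // fold_div
    out = torch.zeros_like(x)
    out[:, 1:, :fold] = x[:, :-1, :fold]                  # frame t receives t-1
    out[:, :-1, fold:2 * fold] = x[:, 1:, fold:2 * fold]  # frame t receives t+1
    out[:, :, 2 * fold:] = x[:, :, 2 * fold:]             # remaining channels untouched
    return out.view(nt, c, h, w)

clip = torch.randn(2 * 8, 32, 56, 56)  # 2 clips of 8 molten-pool frames each
print(temporal_shift(clip, n_segments=8).shape)  # torch.Size([16, 32, 56, 56])
```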

26 pages, 54898 KiB  
Article
MSWF: A Multi-Modal Remote Sensing Image Matching Method Based on a Side Window Filter with Global Position, Orientation, and Scale Guidance
by Jiaqing Ye, Guorong Yu and Haizhou Bao
Sensors 2025, 25(14), 4472; https://doi.org/10.3390/s25144472 - 18 Jul 2025
Viewed by 343
Abstract
Multi-modal remote sensing image (MRSI) matching suffers from severe nonlinear radiometric distortions and geometric deformations, and conventional feature-based techniques are generally ineffective. This study proposes a novel and robust MRSI matching method using the side window filter (MSWF). First, a novel side window scale space is constructed based on the side window filter (SWF), which can preserve shared image contours and facilitate the extraction of feature points within this newly defined scale space. Second, noise thresholds in phase congruency (PC) computation are adaptively refined with the Weibull distribution; weighted phase features are then exploited to determine the principal orientation of each point, from which a maximum index map (MIM) descriptor is constructed. Third, coarse position, orientation, and scale information obtained through global matching are employed to estimate image-pair geometry, after which descriptors are recalculated for precise correspondence search. MSWF is benchmarked against eight state-of-the-art multi-modal methods—six hand-crafted (PSO-SIFT, LGHD, RIFT, RIFT2, HAPCG, COFSM) and two learning-based (CMM-Net, RedFeat) methods—on three public datasets. Experiments demonstrate that MSWF consistently achieves the highest number of correct matches (NCM) and the highest rate of correct matches (RCM) while delivering the lowest root mean square error (RMSE), confirming its superiority for challenging MRSI registration tasks.
(This article belongs to the Section Remote Sensors)
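The side window principle behind the SWF is that each sample is averaged only over windows that end or begin at it, and the side whose mean best matches the sample wins, so edges are not blurred across. The toy below reduces the 2D eight-window filter to a 1D two-window version purely to show the principle; the window radius and the 1D simplification are assumptions, not the MSWF construction.

```python
# Toy 1D side window filter: average over one-sided windows and keep the side
# whose mean is closest to the center sample, preserving the step edge.
import numpy as np

def side_window_filter_1d(x: np.ndarray, r: int = 3) -> np.ndarray:
    out = x.astype(float).copy()
    for i in range(len(x)):
        left = x[max(0, i - r):i + 1]  # window ending at sample i
        right = x[i:i + r + 1]         # window starting at sample i
        means = np.array([left.mean(), right.mean()])
        out[i] = means[np.argmin(np.abs(means - x[i]))]
    return out

step = np.r_[np.zeros(10), np.ones(10)] + np.random.default_rng(1).normal(0, 0.05, 20)
print(np.round(side_window_filter_1d(step), 2))  # edge at index 10 stays sharp
```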

21 pages, 4044 KiB  
Article
DK-SLAM: Monocular Visual SLAM with Deep Keypoint Learning, Tracking, and Loop Closing
by Hao Qu, Lilian Zhang, Jun Mao, Junbo Tie, Xiaofeng He, Xiaoping Hu, Yifei Shi and Changhao Chen
Appl. Sci. 2025, 15(14), 7838; https://doi.org/10.3390/app15147838 - 13 Jul 2025
Viewed by 409
Abstract
The performance of visual SLAM in complex, real-world scenarios is often compromised by unreliable feature extraction and matching when using handcrafted features. Although deep learning-based local features excel at capturing high-level information and perform well on matching benchmarks, they struggle with generalization in continuous motion scenes, adversely affecting loop detection accuracy. To address these challenges, we propose DK-SLAM, a monocular visual SLAM system with deep keypoint learning, tracking, and loop closing. Our system employs a Model-Agnostic Meta-Learning (MAML) strategy to optimize the training of keypoint extraction networks, enhancing their adaptability to diverse environments. Additionally, we introduce a coarse-to-fine feature tracking mechanism for learned keypoints. It begins with a direct method to approximate the relative pose between consecutive frames, followed by a feature matching method for refined pose estimation. To mitigate cumulative positioning errors, DK-SLAM incorporates a novel online learning module that utilizes binary features for loop closure detection. This module dynamically identifies loop nodes within a sequence, ensuring accurate and efficient localization. Experimental evaluations on publicly available datasets demonstrate that DK-SLAM outperforms leading traditional and learning-based SLAM systems, such as ORB-SLAM3 and LIFT-SLAM. DK-SLAM achieves 17.7% better translation accuracy and 24.2% better rotation accuracy than ORB-SLAM3 on KITTI and 34.2% better translation accuracy on EuRoC. These results underscore the efficacy and robustness of our DK-SLAM in varied and challenging real-world environments.
(This article belongs to the Section Robotics and Automation)
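The MAML strategy mentioned here optimizes weights so that a few gradient steps adapt them to a new environment. A compact generic MAML step is sketched below with a tiny regression net standing in for the keypoint network and synthetic sine tasks standing in for environments; everything task-specific is an assumption, not DK-SLAM's training code.

```python
# Generic second-order MAML training step: inner-loop adaptation on a support
# set, outer-loop update from the query loss under the adapted weights.
import torch
from torch.func import functional_call

net = torch.nn.Sequential(torch.nn.Linear(1, 32), torch.nn.ReLU(), torch.nn.Linear(32, 1))
meta_opt = torch.optim.Adam(net.parameters(), lr=1e-3)
inner_lr = 0.01

def sample_task():
    a = torch.rand(1) * 2 + 0.5  # task-specific amplitude = one "environment"
    x = torch.rand(16, 1) * 6 - 3
    return x, a * torch.sin(x)

for _ in range(200):
    meta_loss = 0.0
    for _task in range(4):  # tasks per meta-batch
        x, y = sample_task()
        params = dict(net.named_parameters())
        # inner step: adapt a functional copy of the weights on the support half
        support = torch.nn.functional.mse_loss(functional_call(net, params, (x[:8],)), y[:8])
        grads = torch.autograd.grad(support, list(params.values()), create_graph=True)
        adapted = {n: p - inner_lr * g for (n, p), g in zip(params.items(), grads)}
        # outer objective: query half evaluated under the adapted weights
        meta_loss = meta_loss + torch.nn.functional.mse_loss(
            functional_call(net, adapted, (x[8:],)), y[8:])
    meta_opt.zero_grad()
    meta_loss.backward()
    meta_opt.step()
print(f"final meta-loss: {meta_loss.item():.3f}")
```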

36 pages, 25361 KiB  
Article
Remote Sensing Image Compression via Wavelet-Guided Local Structure Decoupling and Channel–Spatial State Modeling
by Jiahui Liu, Lili Zhang and Xianjun Wang
Remote Sens. 2025, 17(14), 2419; https://doi.org/10.3390/rs17142419 - 12 Jul 2025
Viewed by 467
Abstract
As the resolution and data volume of remote sensing imagery continue to grow, achieving efficient compression without sacrificing reconstruction quality remains a major challenge, given that traditional handcrafted codecs often fail to balance rate-distortion performance and computational complexity, while deep learning-based approaches offer superior representational capacity. However, challenges remain in achieving a balance between fine-detail adaptation and computational efficiency. Mamba, a state–space model (SSM)-based architecture, offers linear-time complexity and excels at capturing long-range dependencies in sequences. It has been adopted in remote sensing compression tasks to model long-distance dependencies between pixels. However, despite its effectiveness in global context aggregation, Mamba’s uniform bidirectional scanning is insufficient for capturing high-frequency structures such as edges and textures. Moreover, existing visual state–space (VSS) models built upon Mamba typically treat all channels equally and lack mechanisms to dynamically focus on semantically salient spatial regions. To address these issues, we present an innovative architecture for remote sensing image compression, called the Multi-scale Channel Global Mamba Network (MGMNet). MGMNet integrates a spatial–channel dynamic weighting mechanism into the Mamba architecture, enhancing global semantic modeling while selectively emphasizing informative features. It comprises two key modules. The Wavelet Transform-guided Local Structure Decoupling (WTLS) module applies multi-scale wavelet decomposition to disentangle and separately encode low- and high-frequency components, enabling efficient parallel modeling of global contours and local textures. The Channel–Global Information Modeling (CGIM) module enhances conventional VSS by introducing a dual-path attention strategy that reweights spatial and channel information, improving the modeling of long-range dependencies and edge structures. We conducted extensive evaluations on three distinct remote sensing datasets to assess the MGMNet. The results show that MGMNet outperforms current SOTA models across various performance metrics.
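The wavelet-guided decoupling in WTLS rests on a standard fact: a single-level 2D wavelet transform splits an image into a low-frequency approximation (global contours) and three high-frequency detail bands (edges and textures), losslessly. The sketch below shows that split with PyWavelets; the Haar basis and single decomposition level are assumptions, not MGMNet's configuration.

```python
# Single-level 2D wavelet split into low/high-frequency components, the kind
# of decomposition WTLS-style modules hand to separate encoding branches.
import numpy as np
import pywt

img = np.random.rand(256, 256).astype(np.float32)  # stand-in for one image band
low, (lh, hl, hh) = pywt.dwt2(img, "haar")         # approximation + 3 detail bands
print(low.shape, lh.shape)                         # (128, 128) each

# perfect reconstruction check: the decomposition loses nothing
rec = pywt.idwt2((low, (lh, hl, hh)), "haar")
print(np.allclose(rec, img, atol=1e-6))
```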

27 pages, 6828 KiB  
Article
A Lightweight Remote-Sensing Image-Change Detection Algorithm Based on Asymmetric Convolution and Attention Coupling
by Enze Zhang, Yan Li, Haifeng Lin and Min Xia
Remote Sens. 2025, 17(13), 2226; https://doi.org/10.3390/rs17132226 - 29 Jun 2025
Viewed by 391
Abstract
Remote-sensing image-change detection is indispensable for land management, environmental monitoring and related applications. In recent years, breakthroughs in satellite sensor technology have generated vast volumes of data and complex scenes, presenting significant challenges for change-detection algorithms. Traditional methods rely on handcrafted features, which struggle to address the impacts of multi-source data heterogeneity and imaging condition differences. In this context, technology based on deep learning has made substantial breakthroughs in change-detection performance by automatically extracting high-level feature representations of the data. However, although the existing deep-learning models improve the detection accuracy through end-to-end learning, their high parameter count and computational inefficiency hinder their suitability for real-time monitoring and edge device deployment. Therefore, to address the need for lightweight solutions in scenarios with limited computing resources, this paper proposes an attention-based lightweight remote sensing change detection network (ABLRCNet), which achieves a balance between computational efficiency and detection accuracy by using lightweight residual convolution blocks (LRCBs), multi-scale spatial-attention modules (MSAMs) and feature-difference enhancement modules (FDEMs). The experimental results demonstrate that the ABLRCNet achieves excellent performance on three datasets, significantly enhancing both the accuracy and robustness of change detection, while exhibiting efficient detection capabilities in resource-limited scenarios.
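A common way to make a residual convolution block lightweight, in the spirit of the LRCBs named above, is to replace a dense 3x3 convolution with stacked asymmetric 3x1 and 1x3 convolutions. The block below is a generic sketch under that assumption; the channel width and layer ordering are illustrative, not the ABLRCNet design.

```python
# Lightweight residual block sketch: a 3x3 conv factorized into 3x1 + 1x3
# asymmetric convs, cutting weights while keeping the receptive field.
import torch
import torch.nn as nn

class AsymResBlock(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, (3, 1), padding=(1, 0), bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, (1, 3), padding=(0, 1), bias=False),
            nn.BatchNorm2d(channels),
        )
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(x + self.body(x))  # residual connection

block = AsymResBlock(32)
print(block(torch.randn(1, 32, 64, 64)).shape)  # spatial size unchanged
n_asym = sum(p.numel() for p in block.parameters() if p.dim() == 4)
print(n_asym, "weights vs", 32 * 32 * 9, "for one dense 3x3 conv")  # 6144 vs 9216
```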

73 pages, 2833 KiB  
Article
A Comprehensive Methodological Survey of Human Activity Recognition Across Diverse Data Modalities
by Jungpil Shin, Najmul Hassan, Abu Saleh Musa Miah and Satoshi Nishimura
Sensors 2025, 25(13), 4028; https://doi.org/10.3390/s25134028 - 27 Jun 2025
Cited by 1 | Viewed by 1454
Abstract
Human Activity Recognition (HAR) systems aim to understand human behavior and assign a label to each action, attracting significant attention in computer vision due to their wide range of applications. HAR can leverage various data modalities, such as RGB images and video, skeleton, depth, infrared, point cloud, event stream, audio, acceleration, and radar signals. Each modality provides unique and complementary information suited to different application scenarios. Consequently, numerous studies have investigated diverse approaches for HAR using these modalities. This survey includes only peer-reviewed research papers published in English to ensure linguistic consistency and academic integrity. This paper presents a comprehensive survey of the latest advancements in HAR from 2014 to 2025, focusing on Machine Learning (ML) and Deep Learning (DL) approaches categorized by input data modalities. We review both single-modality and multi-modality techniques, highlighting fusion-based and co-learning frameworks. Additionally, we cover advancements in hand-crafted action features, methods for recognizing human–object interactions, and activity detection. Our survey includes a detailed dataset description for each modality, as well as a summary of the latest HAR systems, accompanied by a mathematical derivation for evaluating the deep learning model for each modality, and it also provides comparative results on benchmark datasets. Finally, we provide insightful observations and propose effective future research directions in HAR.
(This article belongs to the Special Issue Computer Vision and Sensors-Based Application for Intelligent Systems)

30 pages, 2018 KiB  
Article
Comprehensive Performance Comparison of Signal Processing Features in Machine Learning Classification of Alcohol Intoxication on Small Gait Datasets
by Muxi Qi, Samuel Chibuoyim Uche and Emmanuel Agu
Appl. Sci. 2025, 15(13), 7250; https://doi.org/10.3390/app15137250 - 27 Jun 2025
Viewed by 383
Abstract
Detecting alcohol intoxication is crucial for preventing accidents and enhancing public safety. Traditional intoxication detection methods rely on direct blood alcohol concentration (BAC) measurement via breathalyzers and wearable sensors. These methods require the user to purchase and carry external hardware such as breathalyzers, which is expensive and cumbersome. Convenient, unobtrusive intoxication detection methods using equipment already owned by users are desirable. Recent research has explored machine learning-based approaches using smartphone accelerometers to classify intoxicated gait patterns. While neural network approaches have emerged, due to the significant challenges with collecting intoxicated gait data, gait datasets are often too small to utilize such approaches. To avoid overfitting on such small datasets, traditional machine learning (ML) classification is preferred. A comprehensive set of ML features has been proposed. However, until now, no work has systematically evaluated the performance of various categories of gait features for the alcohol intoxication detection task using traditional machine learning algorithms. This study evaluates 27 signal processing features handcrafted from accelerometer gait data across five domains: time, frequency, wavelet, statistical, and information-theoretic. The data were collected from 24 subjects who experienced simulated alcohol intoxication using “goggle buster” impairment goggles. Correlation-based feature selection (CFS) was employed to rank the features most correlated with alcohol-induced gait changes, revealing that 22 features exhibited statistically significant correlations with BAC levels. These statistically significant features were utilized to train supervised classifiers and assess their impact on alcohol intoxication detection accuracy. Statistical features yielded the highest accuracy (83.89%), followed by time-domain (83.22%) and frequency-domain features (82.21%). Classifying all 22 significant features across domains using a random forest model improved classification accuracy to 84.9%. These findings suggest that incorporating a broader set of signal processing features enhances the accuracy of smartphone-based alcohol intoxication detection.
(This article belongs to the Special Issue AI-Based Biomedical Signal and Image Processing)
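The evaluation pipeline described here (rank handcrafted features by correlation with the label, keep the statistically significant ones, classify with a random forest) is easy to sketch. The data below is synthetic, not the 24-subject gait set, and the significance threshold is an assumption.

```python
# Correlation-based feature filtering + random forest classification sketch,
# mirroring the CFS-then-classify pipeline on synthetic stand-in data.
import numpy as np
from scipy.stats import pearsonr
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(42)
X = rng.normal(size=(200, 27))  # 27 signal-processing features per gait window
y = (X[:, :5].sum(axis=1) + rng.normal(size=200)) > 0  # label tied to a few features

# keep features whose correlation with the label is statistically significant
keep = [j for j in range(X.shape[1])
        if pearsonr(X[:, j], y.astype(float))[1] < 0.05]

clf = RandomForestClassifier(n_estimators=200, random_state=0)
print("selected features:", keep)
print("CV accuracy:", cross_val_score(clf, X[:, keep], y, cv=5).mean())
```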

27 pages, 2049 KiB  
Article
Optimizing Tumor Detection in Brain MRI with One-Class SVM and Convolutional Neural Network-Based Feature Extraction
by Azeddine Mjahad and Alfredo Rosado-Muñoz
J. Imaging 2025, 11(7), 207; https://doi.org/10.3390/jimaging11070207 - 21 Jun 2025
Viewed by 472
Abstract
The early detection of brain tumors is critical for improving clinical outcomes and patient survival. However, medical imaging datasets frequently exhibit class imbalance, posing significant challenges for traditional classification algorithms that rely on balanced data distributions. To address this issue, this study employs a One-Class Support Vector Machine (OCSVM) trained exclusively on features extracted from healthy brain MRI images, using both deep learning architectures—such as DenseNet121, VGG16, MobileNetV2, InceptionV3, and ResNet50—and classical feature extraction techniques. Experimental results demonstrate that combining Convolutional Neural Network (CNN)-based feature extraction with OCSVM significantly improves anomaly detection performance compared with simpler handcrafted approaches. DenseNet121 achieved an accuracy of 94.83%, a precision of 99.23%, and a sensitivity of 89.97%, while VGG16 reached an accuracy of 95.33%, a precision of 98.87%, and a sensitivity of 91.32%. MobileNetV2 showed a competitive trade-off between accuracy (92.83%) and computational efficiency, making it suitable for resource-constrained environments. Additionally, the pure CNN model—trained directly for classification without OCSVM—outperformed hybrid methods with an accuracy of 97.83%, highlighting the effectiveness of deep convolutional networks in directly learning discriminative features from MRI data. This approach enables reliable detection of brain tumor anomalies without requiring labeled pathological data, offering a promising solution for clinical contexts where abnormal samples are scarce. Future research will focus on reducing inference time, expanding and diversifying training datasets, and incorporating explainability tools to support clinical integration and trust in AI-based diagnostics.
(This article belongs to the Section Medical Imaging)
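The core recipe (embed images with a pretrained CNN, fit a One-Class SVM on healthy scans only) can be sketched in a few lines. Below, DenseNet121's classifier head is replaced with an identity so it emits 1024-d features; the random inputs, the nu value, and the preprocessing are placeholders, not the study's setup.

```python
# Anomaly-detection sketch: pretrained DenseNet121 features + One-Class SVM
# fit only on "healthy" examples; real use would feed preprocessed MRI slices.
import torch
from sklearn.svm import OneClassSVM
from torchvision.models import densenet121, DenseNet121_Weights

backbone = densenet121(weights=DenseNet121_Weights.DEFAULT)
backbone.classifier = torch.nn.Identity()  # 1024-d features instead of logits
backbone.eval()

with torch.no_grad():
    healthy = backbone(torch.randn(32, 3, 224, 224)).numpy()  # normal training set
    query = backbone(torch.randn(4, 3, 224, 224)).numpy()     # new scans

ocsvm = OneClassSVM(kernel="rbf", nu=0.1).fit(healthy)
print(ocsvm.predict(query))  # +1 = looks healthy, -1 = anomaly (possible tumor)
```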

28 pages, 4483 KiB  
Article
Historical Manuscripts Analysis: A Deep Learning System for Writer Identification Using Intelligent Feature Selection with Vision Transformers
by Merouane Boudraa, Akram Bennour, Mouaaz Nahas, Rashiq Rafiq Marie and Mohammed Al-Sarem
J. Imaging 2025, 11(6), 204; https://doi.org/10.3390/jimaging11060204 - 19 Jun 2025
Viewed by 706
Abstract
Identifying the scriptwriter in historical manuscripts is crucial for historians, providing valuable insights into historical contexts and aiding in solving historical mysteries. This research presents a robust deep learning system designed for classifying historical manuscripts by writer, employing intelligent feature selection and vision transformers. Our methodology meticulously investigates the efficacy of both handcrafted techniques for feature identification and deep learning architectures for classification tasks in writer identification. The initial preprocessing phase involves thorough document refinement using bilateral filtering for denoising and Otsu thresholding for binarization, ensuring document clarity and consistency for subsequent feature detection. We utilize the FAST detector for feature detection, extracting keypoints representing handwriting styles, followed by clustering with the k-means algorithm to obtain meaningful patches of uniform size. This strategic clustering minimizes redundancy and creates a comprehensive dataset ideal for deep learning classification tasks. Leveraging vision transformer models, our methodology effectively learns complex patterns and features from extracted patches, enabling precise identification of writers across historical manuscripts. This study pioneers the application of vision transformers in historical document analysis, showcasing superior performance on the “ICDAR 2017” dataset compared to state-of-the-art methods and affirming our approach as a robust tool for historical manuscript analysis.
(This article belongs to the Section Document Analysis and Processing)
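The preprocessing and patch-generation stages named in this abstract (bilateral filtering, Otsu binarization, FAST keypoints, k-means clustering, fixed-size patches) map directly onto OpenCV and scikit-learn calls. The sketch below uses a rendered text line as a stand-in page; the patch size, cluster count, and FAST threshold are assumptions, not the paper's values.

```python
# Patch-generation sketch: denoise, binarize (Otsu), detect FAST keypoints,
# cluster their coordinates with k-means, and cut patches around the centers.
import cv2
import numpy as np
from sklearn.cluster import KMeans

page = np.full((512, 512), 255, np.uint8)  # stand-in manuscript page
cv2.putText(page, "lorem ipsum", (30, 260), cv2.FONT_HERSHEY_SIMPLEX, 2, 0, 3)

page = cv2.bilateralFilter(page, 9, 75, 75)  # denoise while preserving strokes
_, binary = cv2.threshold(page, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

fast = cv2.FastFeatureDetector_create(threshold=20)
pts = np.float32([kp.pt for kp in fast.detect(binary, None)])

centers = KMeans(n_clusters=8, n_init="auto", random_state=0).fit(pts).cluster_centers_
patches = [binary[max(0, int(y) - 32):int(y) + 32, max(0, int(x) - 32):int(x) + 32]
           for x, y in centers]
print(len(patches), patches[0].shape)  # patches ready for the ViT classifier
```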

23 pages, 4949 KiB  
Article
Hybrid LDA-CNN Framework for Robust End-to-End Myoelectric Hand Gesture Recognition Under Dynamic Conditions
by Hongquan Le, Marc in het Panhuis, Geoffrey M. Spinks and Gursel Alici
Robotics 2025, 14(6), 83; https://doi.org/10.3390/robotics14060083 - 17 Jun 2025
Viewed by 871
Abstract
Gesture recognition based on conventional machine learning is the main control approach for advanced prosthetic hand systems. Its primary limitation is the need for feature extraction, which must meet real-time control requirements. On the other hand, deep learning models could potentially overfit when trained on small datasets. For these reasons, we propose a hybrid Linear Discriminant Analysis–convolutional neural network (LDA-CNN) framework to improve the gesture recognition performance of sEMG-based prosthetic hand control systems. Within this framework, 1D-CNN filters are trained to generate latent representation that closely approximates Fisher’s (LDA’s) discriminant subspace, constructed from handcrafted features. Under the train-one-test-all evaluation scheme, our proposed hybrid framework consistently outperformed the 1D-CNN trained with cross-entropy loss only, showing improvements from 4% to 11% across two public datasets featuring hand gestures recorded under various limb positions and arm muscle contraction levels. Furthermore, our framework exhibited advantages in terms of induced spectral regularization, which led to a state-of-the-art recognition error of 22.79% with the extended 23-feature set when tested on the multi-limb position dataset. The main novelty of our hybrid framework is that it decouples feature extraction from inference time, enabling the future incorporation of a more extensive set of features, while keeping the inference computation time minimal.
(This article belongs to the Special Issue AI for Robotic Exoskeletons and Prostheses)
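One plausible reading of "latent representation that approximates the LDA subspace" is a CNN trained with cross-entropy plus an alignment loss toward the LDA projection of handcrafted features. The sketch below implements that reading on synthetic sEMG; the network size, feature dimensions, and loss weight are all assumptions, not the authors' framework.

```python
# Hybrid-objective sketch: cross-entropy + MSE alignment of the CNN latent
# vector to the LDA projection of handcrafted features (synthetic data).
import numpy as np
import torch
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
n_classes, n = 6, 300
hand = rng.normal(size=(n, 23))  # handcrafted sEMG feature set
y = rng.integers(0, n_classes, size=n)
lda = LinearDiscriminantAnalysis(n_components=n_classes - 1).fit(hand, y)
target = torch.tensor(lda.transform(hand), dtype=torch.float32)  # Fisher subspace

emg = torch.randn(n, 8, 200)  # raw 8-channel sEMG windows
labels = torch.tensor(y)
cnn = torch.nn.Sequential(
    torch.nn.Conv1d(8, 16, 5), torch.nn.ReLU(), torch.nn.AdaptiveAvgPool1d(1),
    torch.nn.Flatten(), torch.nn.Linear(16, n_classes - 1))
head = torch.nn.Linear(n_classes - 1, n_classes)
opt = torch.optim.Adam(list(cnn.parameters()) + list(head.parameters()), lr=1e-3)

for epoch in range(5):
    z = cnn(emg)  # latent representation to align with the LDA subspace
    loss = torch.nn.functional.cross_entropy(head(z), labels) \
         + 1.0 * torch.nn.functional.mse_loss(z, target)  # alignment term
    opt.zero_grad(); loss.backward(); opt.step()
    print(f"epoch {epoch}: {loss.item():.3f}")
```

At inference only the CNN runs, which is the decoupling the abstract emphasizes: the handcrafted features are needed during training alone.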

22 pages, 1961 KiB  
Article
Incorporating Implicit and Explicit Feature Fusion into Hybrid Recommendation for Improved Rating Prediction
by Qinglong Li, Euiju Jeong, Seok-Kee Lee and Jiaen Li
Electronics 2025, 14(12), 2384; https://doi.org/10.3390/electronics14122384 - 11 Jun 2025
Viewed by 395
Abstract
Online review texts serve as a valuable source of auxiliary information for addressing the data sparsity problem in recommender systems. These reviews often reflect user preferences across multiple item attributes and can be effectively incorporated into recommendation models to enhance both the accuracy and interpretability of recommendations. Review-based recommendation approaches can be broadly classified into implicit and explicit methods. Implicit methods leverage deep learning techniques to extract latent semantic representations from review texts but generally lack interpretability due to limited transparency in the training process. In contrast, explicit methods rely on hand-crafted features derived from domain knowledge, which offer high explanatory capability but typically capture only shallow information. Integrating the complementary strengths of these two approaches presents a promising direction for improving recommendation performance. However, previous research exploring this integration remains limited. In this study, we propose a novel recommendation model that jointly considers implicit and explicit representations derived from review texts. To this end, we incorporate a self-attention mechanism to emphasize important features from each representation type and utilize Bidirectional Encoder Representations from Transformers (BERT) to capture rich contextual information embedded in the reviews. We evaluate the performance of the proposed model through extensive experiments using three real-world datasets. The experimental results demonstrate that our model outperforms several baseline models, confirming its effectiveness in generating accurate and explainable recommendations.
(This article belongs to the Special Issue AI and Machine Learning in Recommender Systems and Customer Behavior)
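The fusion idea can be sketched as: BERT's [CLS] embedding plays the implicit role, a small vector of handcrafted review statistics plays the explicit role, and self-attention weights the two views before a rating head. The explicit features, projection dimensions, and head below are illustrative assumptions, not the proposed model.

```python
# Implicit/explicit fusion sketch: BERT [CLS] embedding + projected handcrafted
# features, combined by self-attention before rating prediction.
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased").eval()

review = "Battery life is great but the screen scratches easily."
with torch.no_grad():
    implicit = bert(**tok(review, return_tensors="pt")).last_hidden_state[:, 0]  # (1, 768)

explicit = torch.tensor([[len(review.split()), 0.6, 0.2]])  # toy: length, sentiment, ...
proj = torch.nn.Linear(3, 768)
views = torch.stack([implicit, proj(explicit)], dim=1)  # (1, 2, 768): two "tokens"

attn = torch.nn.MultiheadAttention(embed_dim=768, num_heads=4, batch_first=True)
fused, _ = attn(views, views, views)  # self-attention over the two views
rating = torch.nn.Linear(768, 1)(fused.mean(dim=1))
print(rating.shape)  # (1, 1) predicted rating
```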

19 pages, 840 KiB  
Article
A Dual-Feature Framework for Enhanced Diagnosis of Myeloproliferative Neoplasm Subtypes Using Artificial Intelligence
by Amna Bamaqa, N. S. Labeeb, Eman M. El-Gendy, Hani M. Ibrahim, Mohamed Farsi, Hossam Magdy Balaha, Mahmoud Badawy and Mostafa A. Elhosseini
Bioengineering 2025, 12(6), 623; https://doi.org/10.3390/bioengineering12060623 - 7 Jun 2025
Viewed by 688
Abstract
Myeloproliferative neoplasms, particularly the Philadelphia chromosome-negative (Ph-negative) subtypes such as essential thrombocythemia, polycythemia vera, and primary myelofibrosis, present diagnostic challenges due to overlapping morphological features and clinical heterogeneity. Traditional diagnostic approaches, including imaging and histopathological analysis, are often limited by interobserver variability, delayed diagnosis, and subjective interpretations. To address these limitations, we propose a novel framework that integrates handcrafted and automatic feature extraction techniques for improved classification of Ph-negative myeloproliferative neoplasms. Handcrafted features capture interpretable morphological and textural characteristics. In contrast, automatic features utilize deep learning models to identify complex patterns in histopathological images. The extracted features were used to train machine learning models, with hyperparameter optimization performed using Optuna. Our framework achieved high performance across multiple metrics, including precision, recall, F1 score, accuracy, specificity, and weighted average. The concatenated probabilities, which combine both feature types, demonstrated the highest mean weighted average of 0.9969, surpassing the individual performances of handcrafted (0.9765) and embedded features (0.9686). Statistical analysis confirmed the robustness and reliability of the results. However, challenges remain in assuming normal distributions for certain feature types. This study highlights the potential of combining domain-specific knowledge with data-driven approaches to enhance diagnostic accuracy and support clinical decision-making.
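"Concatenated probabilities" reads as a stacking scheme: class-probability outputs from a handcrafted-feature model and a deep-feature model are concatenated and fed to a meta-classifier. The sketch below shows that scheme on synthetic data; the base learners, the logistic-regression meta-model, and the omission of Optuna tuning are all assumptions.

```python
# Stacking sketch for "concatenated probabilities": fuse the predicted class
# probabilities of a handcrafted-feature model and a deep-feature model.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(7)
hand = rng.normal(size=(300, 40))   # morphological/textural features
deep = rng.normal(size=(300, 128))  # CNN embeddings
y = rng.integers(0, 3, size=300)    # ET / PV / PMF subtypes (toy labels)

Xh_tr, Xh_te, Xd_tr, Xd_te, y_tr, y_te = train_test_split(
    hand, deep, y, test_size=0.3, random_state=0)

m_hand = RandomForestClassifier(random_state=0).fit(Xh_tr, y_tr)
m_deep = RandomForestClassifier(random_state=0).fit(Xd_tr, y_tr)

stack_tr = np.hstack([m_hand.predict_proba(Xh_tr), m_deep.predict_proba(Xd_tr)])
stack_te = np.hstack([m_hand.predict_proba(Xh_te), m_deep.predict_proba(Xd_te)])
meta = LogisticRegression(max_iter=1000).fit(stack_tr, y_tr)
print("weighted F1:", f1_score(y_te, meta.predict(stack_te), average="weighted"))
```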

26 pages, 12177 KiB  
Article
An Efficient Hybrid 3D Computer-Aided Cephalometric Analysis for Lateral Cephalometric and Cone-Beam Computed Tomography (CBCT) Systems
by Laurine A. Ashame, Sherin M. Youssef, Mazen Nabil Elagamy and Sahar M. El-Sheikh
Computers 2025, 14(6), 223; https://doi.org/10.3390/computers14060223 - 7 Jun 2025
Viewed by 622
Abstract
Lateral cephalometric analysis is commonly used in orthodontics for skeletal classification to ensure an accurate and reliable diagnosis for treatment planning. However, most current research depends on analyzing different types of radiographs, which requires more computational time than 3D analysis. Consequently, this study addresses fully automatic orthodontic tracing based on the usage of artificial intelligence (AI) applied to 2D and 3D images, by designing a cephalometric system that analyzes the significant landmarks and regions of interest (ROI) needed in orthodontic tracing, especially for the mandible and maxilla teeth. In this research, a computerized system is developed to automate orthodontic evaluation tasks for 2D and Cone-Beam Computed Tomography (CBCT, or 3D) measurements. This work was tested on a dataset that contains images of males and females obtained from dental hospitals with patient-informed consent. The dataset consists of 2D lateral cephalometric, panorama and CBCT radiographs. Many scenarios were applied to test the proposed system in landmark prediction and detection. Moreover, this study integrates the Grad-CAM (Gradient-Weighted Class Activation Mapping) technique to generate heat maps, providing transparent visualization of the regions the model focuses on during its decision-making process. By enhancing the interpretability of deep learning predictions, Grad-CAM strengthens clinical confidence in the system’s outputs, ensuring that ROI detection aligns with orthodontic diagnostic standards. This explainability is crucial in medical AI applications, where understanding model behavior is as important as achieving high accuracy. The experimental results achieved an accuracy exceeding 98.9%. This research evaluates and differentiates between the two-dimensional and the three-dimensional tracing analyses applied to measurements based on the practices of the European Board of Orthodontics. The results demonstrate the proposed methodology’s robustness when applied to cephalometric images. Furthermore, the evaluation of 3D analysis usage provides a clear understanding of the significance of integrated deep-learning techniques in orthodontics.
(This article belongs to the Special Issue Machine Learning Applications in Pattern Recognition)
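The Grad-CAM heat-map step described above follows a standard construction: gradients of the predicted class score with respect to the last convolutional feature maps are global-average-pooled into channel weights, which then weight and sum the activations. A compact generic version is sketched below; ResNet18 and the random input stand in for whatever backbone and radiographs the cephalometric system actually uses.

```python
# Compact Grad-CAM sketch: gradient-weighted activation map for the top class.
import torch
import torch.nn.functional as F
from torchvision.models import resnet18, ResNet18_Weights

model = resnet18(weights=ResNet18_Weights.DEFAULT).eval()
acts, grads = {}, {}
layer = model.layer4[-1]  # last conv block
layer.register_forward_hook(lambda m, i, o: acts.update(v=o))
layer.register_full_backward_hook(lambda m, gi, go: grads.update(v=go[0]))

x = torch.randn(1, 3, 224, 224)     # stand-in radiograph
score = model(x).max(dim=1).values  # top predicted class score
score.backward()

weights = grads["v"].mean(dim=(2, 3), keepdim=True)  # GAP over the gradients
cam = F.relu((weights * acts["v"]).sum(dim=1))       # weighted activation map
cam = F.interpolate(cam[None], size=(224, 224), mode="bilinear")[0, 0]
print(cam.shape, float(cam.max()))  # heat map ready to overlay on the image
```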
