Search Results (69)

Search Parameters:
Keywords = capsule network (CapsNet)

25 pages, 2887 KiB  
Article
Federated Learning Based on an Internet of Medical Things Framework for a Secure Brain Tumor Diagnostic System: A Capsule Networks Application
by Roman Rodriguez-Aguilar, Jose-Antonio Marmolejo-Saucedo and Utku Köse
Mathematics 2025, 13(15), 2393; https://doi.org/10.3390/math13152393 - 25 Jul 2025
Viewed by 189
Abstract
Artificial intelligence (AI) has already played a significant role in the healthcare sector, particularly in image-based medical diagnosis. Deep learning models have produced satisfactory and useful results for accurate decision-making. Among the various types of medical images, magnetic resonance imaging (MRI) is frequently utilized in deep learning applications to analyze detailed structures and organs in the body, using advanced intelligent software. However, challenges related to performance and data privacy often arise when using medical data from patients and healthcare institutions. To address these issues, new approaches have emerged, such as federated learning. This technique ensures the secure exchange of sensitive patient and institutional data. It enables machine learning or deep learning algorithms to establish a client–server relationship, whereby specific parameters are securely shared between models while maintaining the integrity of the learning tasks being executed. Federated learning has been successfully applied in medical settings, including diagnostic applications involving medical images such as MRI data. This research introduces an analytical intelligence system based on an Internet of Medical Things (IoMT) framework that employs federated learning to provide a safe and effective diagnostic solution for brain tumor identification. By utilizing specific brain MRI datasets, the model enables multiple local capsule networks (CapsNet) to achieve improved classification results. The average accuracy rate of the CapsNet model exceeds 97%. The precision rate indicates that the CapsNet model performs well in accurately predicting true classes. Additionally, the recall findings suggest that this model is effective in detecting the target classes of meningiomas, pituitary tumors, and gliomas. The integration of these components into an analytical intelligence system that supports the work of healthcare personnel is the main contribution of this work. Evaluations have shown that this approach is effective for diagnosing brain tumors while ensuring data privacy and security. Moreover, it represents a valuable tool for enhancing the efficiency of the medical diagnostic process. Full article
(This article belongs to the Special Issue Innovations in Optimization and Operations Research)
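The client–server parameter exchange described in this abstract is commonly implemented as federated averaging. The listing does not specify the paper's aggregation rule, so the sketch below assumes standard FedAvg (dataset-size-weighted parameter averaging); the client arrays and dataset sizes are purely illustrative:

```python
import numpy as np

def federated_average(client_weights, client_sizes):
    """Aggregate client model parameters, weighted by local dataset size (FedAvg-style)."""
    total = sum(client_sizes)
    return [
        sum(w[i] * (n / total) for w, n in zip(client_weights, client_sizes))
        for i in range(len(client_weights[0]))
    ]

# Two hypothetical clients, each holding one weight matrix and one bias vector.
c1 = [np.ones((2, 2)), np.zeros(2)]
c2 = [3 * np.ones((2, 2)), np.ones(2)]
global_params = federated_average([c1, c2], client_sizes=[100, 300])
```

Each client trains locally and uploads only parameters, never raw MRI data, which is what preserves privacy in this kind of setting.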

18 pages, 1223 KiB  
Article
GazeCapsNet: A Lightweight Gaze Estimation Framework
by Shakhnoza Muksimova, Yakhyokhuja Valikhujaev, Sabina Umirzakova, Jushkin Baltayev and Young Im Cho
Sensors 2025, 25(4), 1224; https://doi.org/10.3390/s25041224 - 17 Feb 2025
Cited by 1 | Viewed by 1556
Abstract
Gaze estimation is increasingly pivotal in applications spanning virtual reality, augmented reality, and driver monitoring systems, necessitating efficient yet accurate models for mobile deployment. Current methodologies often fall short, particularly in mobile settings, due to their extensive computational requirements or reliance on intricate pre-processing. Addressing these limitations, we present Mobile-GazeCapsNet, an innovative gaze estimation framework that harnesses the strengths of capsule networks and integrates them with lightweight architectures such as MobileNet v2, MobileOne, and ResNet-18. This framework not only eliminates the need for facial landmark detection but also significantly enhances real-time operability on mobile devices. Through Self-Attention Routing (SAR), which replaces iterative routing with a lightweight attention-based mechanism, GazeCapsNet dynamically allocates computational resources, improving both accuracy and computational efficiency. Our results show that GazeCapsNet achieves state-of-the-art (SOTA) performance on several benchmark datasets, including ETH-XGaze and Gaze360, achieving a mean angular error (MAE) reduction of up to 15% compared to existing models. Furthermore, the model maintains a real-time processing capability of 20 milliseconds per frame while requiring only 11.7 million parameters, making it exceptionally suitable for real-time applications in resource-constrained environments. These findings not only underscore the efficacy and practicality of GazeCapsNet but also establish a new standard for mobile gaze estimation technologies. Full article
(This article belongs to the Section Sensor Networks)
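Self-Attention Routing as named here replaces iterative dynamic routing with a single attention pass. The abstract does not give the exact formulation, so the agreement-with-mean attention logits below are an assumption; the `squash` nonlinearity and tensor shapes follow common CapsNet conventions:

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    """Capsule nonlinearity: keeps orientation, maps vector length into [0, 1)."""
    n2 = np.sum(s ** 2, axis=axis, keepdims=True)
    return (n2 / (1.0 + n2)) * s / np.sqrt(n2 + eps)

def self_attention_routing(u_hat):
    """Single-pass attention routing over prediction vectors u_hat
    of shape (in_caps, out_caps, dim): coupling weights come from scaled
    dot products with the mean prediction instead of iterative agreement."""
    mean_pred = u_hat.mean(axis=0, keepdims=True)                    # (1, out, dim)
    logits = (u_hat * mean_pred).sum(-1) / np.sqrt(u_hat.shape[-1])  # (in, out)
    c = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)   # softmax over out_caps
    return squash((c[..., None] * u_hat).sum(axis=0))                # (out, dim)

v = self_attention_routing(np.random.default_rng(0).normal(size=(6, 3, 8)))
```

Because there is no per-sample iteration, the cost is one matrix pass per layer, which is the efficiency argument made in the abstract.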

24 pages, 10895 KiB  
Article
Orthogonal Capsule Network with Meta-Reinforcement Learning for Small Sample Hyperspectral Image Classification
by Prince Yaw Owusu Amoako, Guo Cao, Boshan Shi, Di Yang and Benedict Boakye Acka
Remote Sens. 2025, 17(2), 215; https://doi.org/10.3390/rs17020215 - 9 Jan 2025
Cited by 3 | Viewed by 1183
Abstract
Most current hyperspectral image classification (HSIC) models require a large number of training samples, and when the sample size is small, the classification performance decreases. To address this issue, we propose an innovative model that combines an orthogonal capsule network with meta-reinforcement learning (OCN-MRL) for small sample HSIC. The OCN-MRL framework employs Meta-RL for feature selection and CapsNet for classification with a small data sample. The Meta-RL module, through clustering, augmentation, and multiview techniques, enables the model to adapt to new HSIC tasks with limited samples. A meta-policy learned with a Q-learner generalizes across different tasks, effectively selecting discriminative features from the hyperspectral data. Integrating orthogonality into CapsNet reduces the network complexity while maintaining the ability to preserve spatial hierarchies and relationships in the data with a 3D convolution layer, suitably capturing complex patterns. Experimental results on four rich Chinese hyperspectral datasets demonstrate the OCN-MRL model’s competitiveness, achieving both higher classification accuracy and lower computational cost compared to existing CapsNet-based methods. Full article
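The Q-learner behind the meta-policy can be illustrated with the standard tabular Q-learning update; the state/action encoding for band selection below is hypothetical, not taken from the paper:

```python
import numpy as np

def q_update(Q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step:
    Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    td_target = reward + gamma * Q[next_state].max()
    Q[state, action] += alpha * (td_target - Q[state, action])
    return Q

# Toy setting: states index subsets of already-selected spectral bands,
# actions add one more band; reward could be validation accuracy gain.
Q = np.zeros((4, 3))
Q = q_update(Q, state=0, action=1, reward=1.0, next_state=1)
```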

20 pages, 3238 KiB  
Article
Enhanced Disc Herniation Classification Using Grey Wolf Optimization Based on Hybrid Feature Extraction and Deep Learning Methods
by Yasemin Sarı and Nesrin Aydın Atasoy
Tomography 2025, 11(1), 1; https://doi.org/10.3390/tomography11010001 - 26 Dec 2024
Viewed by 1245
Abstract
Due to the increasing number of people working at computers in professional settings, the incidence of lumbar disc herniation is increasing. Background/Objectives: The early diagnosis and treatment of lumbar disc herniation is much more likely to yield favorable results, allowing the hernia to be treated before it develops further. The aim of this study was to classify lumbar disc herniations in a computer-aided, fully automated manner using magnetic resonance images (MRIs). Methods: This study presents a hybrid method integrating residual network (ResNet50), grey wolf optimization (GWO), and machine learning classifiers such as multi-layer perceptron (MLP) and support vector machine (SVM) to improve classification performance. The proposed approach begins with feature extraction using ResNet50, a deep convolutional neural network known for its robust feature representation capabilities. ResNet50’s residual connections allow for effective training and high-quality feature extraction from input images. Following feature extraction, the GWO algorithm, inspired by the social hierarchy and hunting behavior of grey wolves, is employed to optimize the feature set by selecting the most relevant features. Finally, the optimized feature set is fed into machine learning classifiers (MLP and SVM) for classification. The use of various activation functions (e.g., ReLU, identity, logistic, and tanh) in MLP and various kernel functions (e.g., linear, rbf, sigmoid, and polynomial) in SVM allows for a thorough evaluation of the classifiers’ performance. Results: The proposed methodology demonstrates significant improvements in metrics such as accuracy, precision, recall, and F1 score, outperforming traditional approaches in several cases. These results highlight the effectiveness of combining deep learning-based feature extraction with optimization and machine learning classifiers. Conclusions: Compared to other methods, such as capsule networks (CapsNet), EfficientNetB6, and DenseNet169, the proposed ResNet50-GWO-SVM approach achieved superior performance across all metrics, including accuracy, precision, recall, and F1 score, demonstrating its robustness and effectiveness in classification tasks. Full article
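The grey wolf optimization step used here for feature selection follows a well-known position update: every candidate moves toward positions guided by the three best wolves (alpha, beta, delta). The fitness function and dimensions below are toy placeholders, not the paper's setup:

```python
import numpy as np

def gwo_step(wolves, alpha, beta, delta, a, rng):
    """One grey wolf optimization step: each wolf moves to the average of
    three positions guided by the current best wolves (alpha, beta, delta)."""
    new = np.empty_like(wolves)
    for i, x in enumerate(wolves):
        guided = []
        for leader in (alpha, beta, delta):
            r1, r2 = rng.random(x.shape), rng.random(x.shape)
            A, C = 2 * a * r1 - a, 2 * r2   # exploration/exploitation coefficients
            guided.append(leader - A * np.abs(C * leader - x))
        new[i] = np.mean(guided, axis=0)
    return new

rng = np.random.default_rng(0)
wolves = rng.normal(size=(5, 4))                       # 5 wolves, 4-dim search space
idx = np.argsort([np.sum(w ** 2) for w in wolves])     # toy fitness: squared norm
step = gwo_step(wolves, wolves[idx[0]], wolves[idx[1]], wolves[idx[2]], a=2.0, rng=rng)
```

For feature selection, the continuous positions are usually thresholded into a binary keep/drop mask per feature.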

33 pages, 3678 KiB  
Article
A Step Towards Neuroplasticity: Capsule Networks with Self-Building Skip Connections
by Nikolai A. K. Steur and Friedhelm Schwenker
AI 2025, 6(1), 1; https://doi.org/10.3390/ai6010001 - 24 Dec 2024
Viewed by 1554
Abstract
Background: Integrating nonlinear behavior into the architecture of artificial neural networks is regarded as an essential requirement for their effective learning capacity when solving complex tasks. This claim seems to be true for moderate-sized networks, i.e., with a lower double-digit number of layers. However, going deeper with neural networks regularly leads to gradual performance degradation during training. To circumvent this degradation problem, the prominent neural architectures Residual Network and Highway Network establish skip connections with additive identity mappings between layers. Methods: In this work, we unify the mechanics of both architectures into Capsule Networks (CapsNets) by showing their inherent ability to learn skip connections. As a necessary precondition, we introduce the concept of Adaptive Nonlinearity Gates (ANGs), which dynamically steer and limit the usage of nonlinear processing. We propose practical methods for the realization of ANGs, including biased batch normalization, the Doubly-Parametric ReLU (D-PReLU) activation function, and Gated Routing (GR) dedicated to extremely deep CapsNets. Results: Our comprehensive empirical study using MNIST substantiates the effectiveness of our developed methods and delivers valuable insights for the training of very deep nets of any kind. The final experiments on Fashion-MNIST and SVHN demonstrate the potential of pure capsule-driven networks with GR. Full article
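The abstract names a Doubly-Parametric ReLU (D-PReLU) but does not define it; one plausible reading, sketched here, gives the activation independent learnable slopes on the negative and positive halves, so a gate can scale nonlinearity toward an identity mapping. The paper's exact parameterization may differ:

```python
import numpy as np

def d_prelu(x, alpha_neg=0.1, alpha_pos=1.0):
    """D-PReLU sketch: independent slopes for the negative and positive halves.
    With alpha_neg == alpha_pos the activation degenerates to a (scaled)
    identity, which is the kind of behavior a nonlinearity gate can exploit."""
    return np.where(x < 0, alpha_neg * x, alpha_pos * x)

y = d_prelu(np.array([-2.0, 0.0, 3.0]), alpha_neg=0.25, alpha_pos=0.5)
```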

17 pages, 3956 KiB  
Article
EEG–fNIRS-Based Emotion Recognition Using Graph Convolution and Capsule Attention Network
by Guijun Chen, Yue Liu and Xueying Zhang
Brain Sci. 2024, 14(8), 820; https://doi.org/10.3390/brainsci14080820 - 16 Aug 2024
Cited by 6 | Viewed by 3892
Abstract
Electroencephalogram (EEG) and functional near-infrared spectroscopy (fNIRS) can objectively reflect a person’s emotional state and have been widely studied in emotion recognition. However, effective feature fusion and discriminative feature learning from EEG–fNIRS data are challenging. In order to improve the accuracy of emotion recognition, a graph convolution and capsule attention network model (GCN-CA-CapsNet) is proposed. Firstly, EEG–fNIRS signals are collected from 50 subjects whose emotions are induced by video clips. Then, EEG and fNIRS features are extracted and fused to generate higher-quality primary capsules via graph convolution with a Pearson correlation adjacency matrix. Finally, the capsule attention module is introduced to assign different weights to the primary capsules, and higher-quality primary capsules are selected to generate better classification capsules in the dynamic routing mechanism. We validate the efficacy of the proposed method on our emotional EEG–fNIRS dataset with an ablation study. Extensive experiments demonstrate that the proposed GCN-CA-CapsNet method outperforms state-of-the-art methods, increasing the average accuracy by 3–11%. Full article
(This article belongs to the Section Cognitive, Social and Affective Neuroscience)
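Building the adjacency matrix from Pearson correlations and applying one graph convolution can be sketched as follows. The threshold, channel count, and layer shapes are illustrative assumptions; the paper's precise normalization is not given in the abstract:

```python
import numpy as np

def pearson_adjacency(features, threshold=0.3):
    """Adjacency from absolute Pearson correlation between channel features."""
    corr = np.abs(np.corrcoef(features))     # (channels, channels)
    adj = (corr >= threshold) * corr         # keep only strong correlations
    np.fill_diagonal(adj, 1.0)               # self-loops
    return adj

def graph_conv(adj, x, w):
    """One symmetric-normalized graph convolution: ReLU(D^-1/2 A D^-1/2 X W)."""
    d_inv_sqrt = np.diag(1.0 / np.sqrt(adj.sum(axis=1)))
    return np.maximum(d_inv_sqrt @ adj @ d_inv_sqrt @ x @ w, 0.0)

rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 100))            # 8 channels, 100 samples each
adj = pearson_adjacency(feats)
out = graph_conv(adj, rng.normal(size=(8, 16)), rng.normal(size=(16, 4)))
```

The outputs per channel would then be grouped into the primary capsules that feed the attention and routing stages.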

17 pages, 16956 KiB  
Article
Motor Fault Diagnosis Using Attention-Based Multisensor Feature Fusion
by Zhuoyao Miao, Wenshan Feng, Zhuo Long, Gongping Wu, Le Deng, Xuan Zhou and Liwei Xie
Energies 2024, 17(16), 4053; https://doi.org/10.3390/en17164053 - 15 Aug 2024
Cited by 1 | Viewed by 1289
Abstract
In order to reduce the influence of environmental noise and different operating conditions on the accuracy of motor fault diagnosis, this paper proposes a capsule network method combining multi-channel signals and the efficient channel attention (ECA) mechanism. Data are sampled from multiple sensors, and the one-dimensional time-frequency-domain signals are visualized as two-dimensional symmetric dot pattern (SDP) images; the multi-channel image data are then fused, and features are extracted from the images using a capsule network with the ECA attention mechanism to match eight different fault types for fault classification. In order to guarantee the universality of the suggested model, data from Case Western Reserve University (CWRU) are used for validation. According to the experimental findings, the suggested multi-channel signal fusion ECA attention capsule network (MSF-ECA-CapsNet) model reaches a fault identification accuracy of 99.21%, which is higher than that of the traditional method. Meanwhile, multi-sensor data fusion and the ECA attention mechanism further improve the diagnosis accuracy. Full article
(This article belongs to the Section F: Electrical Engineering)
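The symmetric dot pattern (SDP) visualization maps a 1-D signal into mirrored polar arms. A common parameterization is sketched below (lag, rotation angle, and number of mirror axes are tunable; the paper's exact settings are not stated in the abstract):

```python
import numpy as np

def sdp_points(x, lag=1, zeta=np.deg2rad(36), n_mirrors=6):
    """Map a 1-D signal to symmetric dot pattern polar points (r, theta).
    The radius encodes amplitude at i; the angle offset encodes amplitude
    at i + lag, mirrored around n_mirrors symmetry axes."""
    x = np.asarray(x, dtype=float)
    r = (x - x.min()) / (x.max() - x.min() + 1e-12)   # normalize to [0, 1]
    ang = r * zeta
    pts = []
    for m in range(n_mirrors):
        phi = 2 * np.pi * m / n_mirrors
        for i in range(len(x) - lag):
            pts.append((r[i], phi + ang[i + lag]))    # clockwise arm
            pts.append((r[i], phi - ang[i + lag]))    # mirrored arm
    return np.array(pts)

pts = sdp_points(np.sin(np.linspace(0, 4 * np.pi, 200)))
```

Rendering these points yields the snowflake-like image that becomes one channel of the fused input to the capsule network.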

17 pages, 2393 KiB  
Article
A Modified Bio-Inspired Optimizer with Capsule Network for Diagnosis of Alzheimer Disease
by Praveena Ganesan, G. P. Ramesh, C. Puttamdappa and Yarlagadda Anuradha
Appl. Sci. 2024, 14(15), 6798; https://doi.org/10.3390/app14156798 - 4 Aug 2024
Cited by 34 | Viewed by 1662
Abstract
Alzheimer’s disease (AD) is one of the most common neurodegenerative disorders, primarily occurring in old age. Structural magnetic resonance imaging (sMRI) is an effective imaging technique used in clinical practice for determining the stage of AD patients. An efficient deep learning framework is proposed in this paper for AD detection, which is inspired by clinical practice. The proposed deep learning framework significantly enhances the performance of AD classification while requiring less processing time. Initially, in the proposed framework, the sMRI images are acquired from a real-time dataset and two online datasets, including the Australian Imaging, Biomarker and Lifestyle flagship study of ageing (AIBL) and the Alzheimer’s Disease Neuroimaging Initiative (ADNI). Next, a fuzzy-based superpixel-clustering algorithm is introduced to segment the region of interest (RoI) in sMRI images. Then, informative deep features are extracted from the segmented RoI images by integrating the probabilistic local ternary pattern (PLTP), ResNet-50, and Visual Geometry Group (VGG)-16. Furthermore, dimensionality reduction is accomplished through the modified gorilla troops optimizer (MGTO). This process not only enhances the classification performance but also diminishes the processing time of the capsule network (CapsNet), which is employed to classify the classes of AD. In the MGTO algorithm, a quasi-reflection-based learning (QRBL) process is introduced for generating the silverback’s quasi-reflection position, further improving the quality of the optimal position. The proposed fuzzy-based superpixel-clustering algorithm and MGTO-CapsNet model obtained pixel accuracies of 0.96, 0.94, and 0.98 and classification accuracies of 99.88%, 96.38%, and 99.94% on the ADNI, real-time, and AIBL datasets, respectively. Full article
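Quasi-reflection-based learning is commonly defined as sampling a candidate uniformly between the domain center and the current position; whether MGTO applies it exactly this way is not stated in the abstract, so treat this as a sketch of the general technique:

```python
import numpy as np

def quasi_reflect(x, lower, upper, rng):
    """Quasi-reflection-based learning: sample each coordinate uniformly
    between the domain center and the current position."""
    center = (lower + upper) / 2.0
    lo, hi = np.minimum(center, x), np.maximum(center, x)
    return rng.uniform(lo, hi)

rng = np.random.default_rng(0)
x = np.array([0.9, -0.8, 0.1])                      # current silverback position
x_qr = quasi_reflect(x, lower=-1.0, upper=1.0, rng=rng)
```

The quasi-reflected candidate replaces the original only if its fitness is better, which is how it improves the optimal position's quality.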

19 pages, 5134 KiB  
Article
Attribute Feature Perturbation-Based Augmentation of SAR Target Data
by Rubo Jin, Jianda Cheng, Wei Wang, Huiqiang Zhang and Jun Zhang
Sensors 2024, 24(15), 5006; https://doi.org/10.3390/s24155006 - 2 Aug 2024
Cited by 1 | Viewed by 1170
Abstract
Large-scale, diverse, and high-quality data are the basis and key to achieving a good generalization of target detection and recognition algorithms based on deep learning. However, the existing methods for the intelligent augmentation of synthetic aperture radar (SAR) images are confronted with several issues, including training instability, inferior image quality, lack of physical interpretability, etc. To solve the above problems, this paper proposes a feature-level SAR target-data augmentation method. First, an enhanced capsule neural network (CapsNet) is proposed and employed for feature extraction, decoupling the attribute information of input data. Moreover, an attention mechanism-based attribute decoupling framework is used, which is beneficial for achieving a more effective representation of features. After that, the decoupled attribute feature, including amplitude, elevation angle, azimuth angle, and shape, can be perturbed to increase the diversity of features. On this basis, the augmentation of SAR target images is realized by reconstructing the perturbed features. In contrast to the augmentation methods using random noise as input, the proposed method realizes the mapping from the input of known distribution to the change in unknown distribution. This mapping method reduces the correlation distance between the input signal and the augmented data, therefore diminishing the demand for training data. In addition, we combine pixel loss and perceptual loss in the reconstruction process, which improves the quality of the augmented SAR data. The evaluation of the real and augmented images is conducted using four assessment metrics. The images generated by this method achieve a peak signal-to-noise ratio (PSNR) of 21.6845, radiometric resolution (RL) of 3.7114, and dynamic range (DR) of 24.0654. The experimental results demonstrate the superior performance of the proposed method. Full article
(This article belongs to the Section Sensing and Imaging)
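The PSNR figure quoted in this abstract is a standard image quality metric and can be computed directly; the reference and test images below are synthetic placeholders:

```python
import numpy as np

def psnr(ref, test, max_val=255.0):
    """Peak signal-to-noise ratio in dB between a reference and a test image."""
    mse = np.mean((ref.astype(float) - test.astype(float)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(max_val ** 2 / mse)

ref = np.full((8, 8), 128.0)
noisy = ref + 10.0            # constant error of 10 gray levels
val = psnr(ref, noisy)        # 20 * log10(255 / 10) dB
```

Higher PSNR means the augmented image is radiometrically closer to its reference, which is why it is paired here with radiometric resolution and dynamic range.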

15 pages, 1537 KiB  
Article
3Cs: Unleashing Capsule Networks for Robust COVID-19 Detection Using CT Images
by Rawan Alaufi, Felwa Abukhodair and Manal Kalkatawi
COVID 2024, 4(8), 1113-1127; https://doi.org/10.3390/covid4080077 - 24 Jul 2024
Cited by 1 | Viewed by 1228
Abstract
The COVID-19 pandemic has spread worldwide for over two years. It was considered a significant threat to global health due to its transmissibility and high pathogenicity. The standard test for COVID-19, namely, reverse transcription polymerase chain reaction (RT–PCR), is somewhat inaccurate and might have a high false-negative rate (FNR). As a result, an infected person with a negative test result may unknowingly continue to spread the virus, especially if they are infected with an undiscovered COVID-19 strain. Thus, a more accurate diagnostic technique is required. In this study, we propose 3Cs, a capsule neural network (CapsNet) used to classify computed tomography (CT) images as novel coronavirus pneumonia (NCP), common pneumonia (CP), or normal lungs. Using 6123 CT images of healthy patients’ lungs and those of patients with CP and NCP, the 3Cs method achieved an accuracy of around 98% and an FNR of about 2%, demonstrating CapsNet’s ability to extract features from CT images that distinguish between healthy and infected lungs. This research confirmed that using CapsNet to detect COVID-19 from CT images results in a lower FNR compared to RT–PCR. Thus, it can be used in conjunction with RT–PCR to diagnose COVID-19 regardless of the variant. Full article
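The false-negative rate that this abstract optimizes is a simple ratio over the confusion counts; the toy labels below are illustrative, not from the paper's dataset:

```python
import numpy as np

def false_negative_rate(y_true, y_pred, positive=1):
    """FNR = FN / (FN + TP): the fraction of true positives the test misses."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    pos = y_true == positive
    fn = np.sum(pos & (y_pred != positive))
    return fn / max(pos.sum(), 1)

# Toy labels: 1 = infected (NCP or CP), 0 = normal lungs.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 1]
y_pred = [1, 1, 0, 1, 0, 0, 1, 0, 1, 1]
fnr = false_negative_rate(y_true, y_pred)   # one missed case out of six positives
```

A low FNR matters more than raw accuracy here, since a missed infection lets the patient keep spreading the virus.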

27 pages, 4708 KiB  
Article
Using Segmentation to Boost Classification Performance and Explainability in CapsNets
by Dominik Vranay, Maroš Hliboký, László Kovács and Peter Sinčák
Mach. Learn. Knowl. Extr. 2024, 6(3), 1439-1465; https://doi.org/10.3390/make6030068 - 28 Jun 2024
Cited by 1 | Viewed by 1887
Abstract
In this paper, we present Combined-CapsNet (C-CapsNet), a novel approach aimed at enhancing the performance and explainability of Capsule Neural Networks (CapsNets) in image classification tasks. Our method involves the integration of segmentation masks as reconstruction targets within the CapsNet architecture. This integration helps in better feature extraction by focusing on significant image parts while reducing the number of parameters required for accurate classification. C-CapsNet combines principles from Efficient-CapsNet and the original CapsNet, introducing several novel improvements such as the use of segmentation masks to reconstruct images and a number of tweaks to the routing algorithm, which enhance both classification accuracy and interpretability. We evaluated C-CapsNet using the Oxford-IIIT Pet and SIIM-ACR Pneumothorax datasets, achieving mean F1 scores of 93% and 67%, respectively. These results demonstrate a significant performance improvement over traditional CapsNet and CNN models. The method’s effectiveness is further highlighted by its ability to produce clear and interpretable segmentation masks, which can be used to validate the network’s focus during classification tasks. Our findings suggest that C-CapsNet not only improves the accuracy of CapsNets but also enhances their explainability, making them more suitable for real-world applications, particularly in medical imaging. Full article
(This article belongs to the Section Network)

19 pages, 3496 KiB  
Article
Capsule Broad Learning System Network for Robust Synthetic Aperture Radar Automatic Target Recognition with Small Samples
by Cuilin Yu, Yikui Zhai, Haifeng Huang, Qingsong Wang and Wenlve Zhou
Remote Sens. 2024, 16(9), 1526; https://doi.org/10.3390/rs16091526 - 26 Apr 2024
Cited by 2 | Viewed by 1375
Abstract
The utilization of deep learning in Synthetic Aperture Radar (SAR) Automatic Target Recognition (ATR) has witnessed a recent surge owing to its remarkable feature extraction capabilities. Nonetheless, deep learning methodologies are often encumbered by inadequacies in labeled data and the protracted nature of training processes. To address these challenges and offer an alternative avenue for accurately extracting image features, this paper puts forth a novel and distinctive network dubbed the Capsule Broad Learning System Network for robust SAR ATR (CBLS-SARNET). This novel strategy is specifically tailored to cater to small-sample SAR ATR scenarios. On the one hand, we introduce a United Division Co-training (UDC) Framework as a feature filter, adeptly amalgamating CapsNet and the Broad Learning System (BLS) to enhance network efficiency and efficacy. On the other hand, we devise a Parameters Sharing (PS) network to facilitate secondary learning by sharing the weight and bias of BLS node layers, thereby augmenting the recognition capability of CBLS-SARNET. Experimental results unequivocally demonstrate that our proposed CBLS-SARNET outperforms other deep learning methods in terms of recognition accuracy and training time. Furthermore, experiments validate the generalization and robustness of our novel method under various conditions, including the addition of blur, Gaussian noise, noisy labels, and different depression angles. These findings underscore the superior generalization capabilities of CBLS-SARNET across diverse SAR ATR scenarios. Full article

21 pages, 4421 KiB  
Article
Research on a Capsule Network Text Classification Method with a Self-Attention Mechanism
by Xiaodong Yu, Shun-Nain Luo, Yujia Wu, Zhufei Cai, Ta-Wen Kuan and Shih-Pang Tseng
Symmetry 2024, 16(5), 517; https://doi.org/10.3390/sym16050517 - 24 Apr 2024
Cited by 3 | Viewed by 1979
Abstract
Convolutional neural networks (CNNs) need to replicate feature detectors when modeling spatial information, which reduces their efficiency. The number of replicated feature detectors or labeled training data required for such methods grows exponentially with the dimensionality of the data being used. On the other hand, space-insensitive methods are difficult to encode and express effectively, which limits their ability to capture rich text structures. In response to these problems, this paper proposes a capsule network with a self-attention mechanism (self-attention capsule network, or SA-CapsNet) for text classification tasks, wherein the capsule network itself, exploiting the symmetry between its two ends, acts as both encoder and decoder. In order to learn long-distance dependent features in sentences and encode text information more efficiently, SA-CapsNet maps the self-attention module to the feature extraction layer of the capsule network, thereby increasing its feature extraction ability and overcoming the limitations of convolutional neural networks. In addition, in this study, in order to improve the accuracy of the model, the capsule was improved by reducing its dimension, and an intermediate layer was added, enabling the model to obtain more expressive instantiation features in a given sentence. Finally, experiments were carried out on three general datasets of different sizes, namely the IMDB, MPQA, and MR datasets. The accuracy of the model on these three datasets was 84.72%, 80.31%, and 75.38%, respectively. Furthermore, compared with the benchmark algorithm, the model’s performance on these datasets was promising, with an increase in accuracy of 1.08%, 0.39%, and 1.43%, respectively. This study also focused on reducing the parameters of the model for applications such as edge and mobile deployment; the experimental results show that accuracy is not noticeably decreased by the reduced parameters, verifying the effective performance of the proposed SA-CapsNet model. Full article
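The self-attention module mapped into the feature extraction layer is, at its core, scaled dot-product attention over the token sequence. A minimal single-head sketch follows; the projection sizes are illustrative, and the paper's multi-head or masking details are not given in the abstract:

```python
import numpy as np

def self_attention(x, wq, wk, wv):
    """Scaled dot-product self-attention over a token sequence x of shape (seq, d_in)."""
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(k.shape[-1])                # (seq, seq) similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)         # row-wise softmax
    return weights @ v, weights

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 16))                               # 5 token embeddings
out, attn = self_attention(x, *(rng.normal(size=(16, 8)) for _ in range(3)))
```

Because every token attends to every other token in one step, long-distance dependencies are captured without the locality constraint of convolutions.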

19 pages, 6233 KiB  
Article
Fault Diagnosis for Power Batteries Based on a Stacked Sparse Autoencoder and a Convolutional Block Attention Capsule Network
by Juan Zhou, Shun Zhang and Peng Wang
Processes 2024, 12(4), 816; https://doi.org/10.3390/pr12040816 - 18 Apr 2024
Cited by 4 | Viewed by 1912
Abstract
The power battery constitutes the fundamental component of new energy vehicles. Rapid and accurate fault diagnosis of power batteries can effectively improve the safety and power performance of the vehicle. In response to the issues of limited generalization ability and suboptimal diagnostic accuracy observed in traditional power battery fault diagnosis models, this study proposes a fault diagnosis method utilizing a Convolutional Block Attention Capsule Network (CBAM-CapsNet) based on a stacked sparse autoencoder (SSAE). The reconstructed dataset is initially input into the SSAE model. Layer-by-layer greedy learning using unsupervised learning is employed, combining unsupervised learning methods with parameter updating and local fine-tuning to enhance visualization capabilities. The CBAM is then integrated into the CapsNet, which not only mitigates the effect of noise on the SSAE but also improves the model’s ability to characterize power cell features, completing the fault diagnosis process. The experimental comparison results show that the proposed method can diagnose power battery failure modes with an accuracy of 96.86%, and it surpasses CNN, CapsNet, CBAM-CapsNet, and other neural networks on various evaluation indexes, identifying fault types with higher diagnostic accuracy and robustness. Full article

20 pages, 27165 KiB  
Article
MES-CTNet: A Novel Capsule Transformer Network Base on a Multi-Domain Feature Map for Electroencephalogram-Based Emotion Recognition
by Yuxiao Du, Han Ding, Min Wu, Feng Chen and Ziman Cai
Brain Sci. 2024, 14(4), 344; https://doi.org/10.3390/brainsci14040344 - 30 Mar 2024
Cited by 5 | Viewed by 2219
Abstract
Emotion recognition using the electroencephalogram (EEG) has garnered significant attention within the realm of human–computer interaction due to the wealth of genuine emotional data stored in EEG signals. However, traditional emotion recognition methods are deficient in mining the connections between multi-domain features and leveraging their combined advantages. In this paper, we propose a novel capsule Transformer network based on multi-domain features for EEG-based emotion recognition, referred to as MES-CTNet. The model’s core consists of a multichannel capsule neural network (CapsNet) embedded with ECA (Efficient Channel Attention) and SE (Squeeze and Excitation) blocks and a Transformer-based temporal coding layer. Firstly, a multi-domain feature map is constructed by combining the space–frequency–time characteristics of the multi-domain features as input to the model. Then, local emotion features are extracted from the multi-domain feature maps by the improved CapsNet. Finally, the Transformer-based temporal coding layer is utilized to globally perceive the emotion feature information of the continuous time slices to obtain a final emotion state. The paper reports extensive experiments on two standard datasets with different emotion labels, the DEAP and SEED datasets. On the DEAP dataset, MES-CTNet achieved an average accuracy of 98.31% in the valence dimension and 98.28% in the arousal dimension; it achieved 94.91% for the cross-session task on the SEED dataset, demonstrating superior performance compared to traditional EEG emotion recognition methods. The MES-CTNet method, utilizing a multi-domain feature map as proposed herein, offers a broader observation perspective for EEG-based emotion recognition. It significantly enhances the classification recognition rate, thereby holding considerable theoretical and practical value in the EEG emotion recognition domain. Full article
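The SE (Squeeze and Excitation) block embedded in the CapsNet follows a standard recipe: global-average-pool each channel, pass the descriptor through a small bottleneck, and rescale the channels. The sketch below uses illustrative channel counts; the ECA variant differs only in how the gates are computed:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def se_block(x, w1, w2):
    """Squeeze-and-Excitation: pool each channel to one scalar (squeeze),
    map it through a two-layer bottleneck to gates in (0, 1) (excitation),
    then rescale the channels (recalibration)."""
    s = x.mean(axis=(1, 2))                       # squeeze: (C,)
    z = sigmoid(np.maximum(s @ w1, 0.0) @ w2)     # excitation gates in (0, 1)
    return x * z[:, None, None]                   # channel-wise recalibration

rng = np.random.default_rng(0)
x = rng.normal(size=(32, 7, 7))                   # (channels, H, W) feature map
y = se_block(x, rng.normal(size=(32, 8)), rng.normal(size=(8, 32)))
```

The learned gates emphasize the EEG channels that carry the most emotional information before the capsules are formed.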
