AI-Based Automated Recognition and Detection in Healthcare

A special issue of Sensors (ISSN 1424-8220). This special issue belongs to the section "Intelligent Sensors".

Deadline for manuscript submissions: 31 August 2025 | Viewed by 17616

Special Issue Editors


Guest Editor
Department of Engineering and Mathematics, Sheffield Hallam University, Sheffield S1 1WB, UK
Interests: health monitoring service platforms; deep learning; Internet of Things; EEG (electroencephalography); biofeedback; signal analysis; neuroscience

Guest Editor
School of Mathematics, University of Southern Queensland, Toowoomba, Australia
Interests: artificial intelligence

Special Issue Information

Dear Colleagues,

AI-based computer-aided diagnosis relies on recognizing and detecting disease symptoms in medical data. Evaluating AI models is crucial for driving progress and fostering competition. While traditional evaluation metrics such as accuracy, sensitivity, and specificity are widely understood, they often overlook biases and noise present in the training data. This oversight creates ambiguity, making it difficult to compare competing AI-based solutions. In this Special Issue, we seek papers that go beyond standard performance reporting. This can be achieved through preprocessing methods that identify biases and noise in the training data or through post-processing techniques that augment performance measures with explainability. For this Special Issue, we define medical data as sensor-derived images and physiological signals.

Dr. Ningrong Lei
Dr. Oliver Faust
Prof. Dr. U Rajendra Acharya
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website. Once registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Sensors is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • bias and noise
  • explainability
  • computer-aided diagnosis
  • artificial intelligence

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies is available on the MDPI website.

Published Papers (9 papers)


Research


15 pages, 3122 KiB  
Article
Simultaneous Speech and Eating Behavior Recognition Using Data Augmentation and Two-Stage Fine-Tuning
by Toshihiro Tsukagoshi, Masafumi Nishida and Masafumi Nishimura
Sensors 2025, 25(5), 1544; https://doi.org/10.3390/s25051544 - 2 Mar 2025
Viewed by 617
Abstract
Speaking and eating are essential components of health management. To enable the daily monitoring of these behaviors, systems capable of simultaneously recognizing speech and eating behaviors are required. However, due to the distinct acoustic and contextual characteristics of these two domains, achieving high-precision integrated recognition remains underexplored. In this study, we propose a method that combines data augmentation through synthetic data creation with a two-stage fine-tuning approach tailored to the complexity of domain adaptation. By concatenating speech and eating sounds of varying lengths and sequences, we generated training data that mimic real-world environments where speech and eating behaviors co-exist. Additionally, efficient model adaptation was achieved through two-stage fine-tuning of the self-supervised learning model. The experimental evaluations demonstrate that the proposed method maintains speech recognition accuracy while achieving high detection performance for eating behaviors, with an F1 score of 0.918 for chewing detection and 0.926 for swallowing detection. These results underscore the potential of using voice recognition technology for daily health monitoring.
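The data-augmentation step described in the abstract, concatenating speech and eating sounds of varying lengths and sequences, can be sketched in a few lines of pure Python. This is an illustrative sketch, not the authors' implementation; the function name and the per-sample 0/1 label track are assumptions for the example.

```python
import random

def make_synthetic_clip(speech_segments, eating_segments, n_parts=4, seed=0):
    """Concatenate randomly chosen speech and eating segments into one
    clip, emitting a per-sample label track (0 = speech, 1 = eating).
    Mimics recordings where both behaviors co-occur."""
    rng = random.Random(seed)
    clip, labels = [], []
    for _ in range(n_parts):
        if rng.random() < 0.5:
            seg, lab = rng.choice(speech_segments), 0
        else:
            seg, lab = rng.choice(eating_segments), 1
        clip.extend(seg)
        labels.extend([lab] * len(seg))
    return clip, labels
```

In a real pipeline the segments would be audio sample arrays and the label track would supply frame-level supervision for the fine-tuning stages.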
(This article belongs to the Special Issue AI-Based Automated Recognition and Detection in Healthcare)

17 pages, 18059 KiB  
Article
Robust Multi-Subtype Identification of Breast Cancer Pathological Images Based on a Dual-Branch Frequency Domain Fusion Network
by Jianjun Li, Kaiyue Wang and Xiaozhe Jiang
Sensors 2025, 25(1), 240; https://doi.org/10.3390/s25010240 - 3 Jan 2025
Cited by 1 | Viewed by 856
Abstract
Breast cancer (BC) is one of the most lethal cancers worldwide, and its early diagnosis is critical for improving patient survival rates. However, the extraction of key information from complex medical images and the attainment of high-precision classification present a significant challenge. In the field of signal processing, texture-rich images typically exhibit periodic patterns and structures, which are manifested as significant energy concentrations at specific frequencies in the frequency domain. Given the above considerations, this study is designed to explore the application of frequency domain analysis in BC histopathological classification. This study proposes the dual-branch adaptive frequency domain fusion network (AFFNet), designed to enable each branch to specialize in distinct frequency domain features of pathological images. Additionally, two different frequency domain approaches, namely Multi-Spectral Channel Attention (MSCA) and Fourier Filtering Enhancement Operator (FFEO), are employed to enhance the texture features of pathological images and minimize information loss. Moreover, the contributions of the two branches at different stages are dynamically adjusted by a frequency-domain-adaptive fusion strategy to accommodate the complexity and multi-scale features of pathological images. The experimental results, based on two public BC histopathological image datasets, corroborate the idea that AFFNet outperforms 10 state-of-the-art image classification methods, underscoring its effectiveness and superiority in this domain.
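The abstract's premise, that periodic patterns concentrate energy at specific frequencies, is easy to demonstrate with a naive 1-D DFT. This sketch is not the paper's MSCA or FFEO operator (those act on 2-D feature maps); it only illustrates the underlying signal-processing fact.

```python
import cmath
import math

def dft_magnitudes(x):
    """Naive discrete Fourier transform magnitudes |X[k]| of a real
    signal; a strictly periodic pattern concentrates its energy in a
    sharp peak at its repetition frequency."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n)]
```

A pure cosine with four cycles over the window produces a magnitude spectrum that is near zero everywhere except at bin 4 (and its mirror bin).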
(This article belongs to the Special Issue AI-Based Automated Recognition and Detection in Healthcare)

23 pages, 3424 KiB  
Article
Automated Detection of Gastrointestinal Diseases Using Resnet50*-Based Explainable Deep Feature Engineering Model with Endoscopy Images
by Veysel Yusuf Cambay, Prabal Datta Barua, Abdul Hafeez Baig, Sengul Dogan, Mehmet Baygin, Turker Tuncer and U. R. Acharya
Sensors 2024, 24(23), 7710; https://doi.org/10.3390/s24237710 - 2 Dec 2024
Cited by 2 | Viewed by 1431
Abstract
This work aims to develop a novel convolutional neural network (CNN) named ResNet50* to detect various gastrointestinal diseases using a new ResNet50*-based deep feature engineering model with endoscopy images. The novelty of this work is the development of ResNet50*, a new variant of the ResNet model, featuring convolution-based residual blocks and a pooling-based attention mechanism similar to PoolFormer. Using ResNet50*, a gastrointestinal image dataset was trained, and an explainable deep feature engineering (DFE) model was developed. This DFE model comprises four primary stages: (i) feature extraction, (ii) iterative feature selection, (iii) classification using shallow classifiers, and (iv) information fusion. The DFE model is self-organizing, producing 14 different outcomes (8 classifier-specific and 6 voted) and selecting the most effective result as the final decision. During feature extraction, heatmaps are identified using gradient-weighted class activation mapping (Grad-CAM) with features derived from these regions via the final global average pooling layer of the pretrained ResNet50*. Four iterative feature selectors are employed in the feature selection stage to obtain distinct feature vectors. The classifiers k-nearest neighbors (kNN) and support vector machine (SVM) are used to produce specific outcomes. Iterative majority voting is employed in the final stage to obtain voted outcomes using the top result determined by the greedy algorithm based on classification accuracy. The presented ResNet50* was trained on an augmented version of the Kvasir dataset, and its performance was tested using Kvasir, Kvasir version 2, and wireless capsule endoscopy (WCE) curated colon disease image datasets. Our proposed ResNet50* model demonstrated a classification accuracy of more than 92% for all three datasets and a remarkable 99.13% accuracy for the WCE dataset. These findings affirm the superior classification ability of the ResNet50* model and confirm the generalizability of the developed architecture, showing consistent performance across all three distinct datasets.
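The fusion stage described above, iterative majority voting over classifier outputs with the best combination chosen greedily by accuracy, can be sketched as follows. This is an illustrative reconstruction, not the authors' code; the function name and tie-breaking behavior are assumptions.

```python
from collections import Counter

def iterative_majority_voting(preds, y_true):
    """Greedy fusion: rank classifier outputs by individual accuracy,
    then form majority votes over the top-2, top-3, ... classifiers and
    keep the outcome (single or voted) with the best accuracy."""
    def acc(p):
        return sum(a == b for a, b in zip(p, y_true)) / len(y_true)
    ranked = sorted(preds, key=acc, reverse=True)
    best_preds, best_acc = ranked[0], acc(ranked[0])
    for k in range(2, len(ranked) + 1):
        voted = [Counter(col).most_common(1)[0][0]
                 for col in zip(*ranked[:k])]
        if acc(voted) > best_acc:
            best_preds, best_acc = voted, acc(voted)
    return best_preds, best_acc
```

In the paper this selection runs over the 8 classifier-specific outcomes to produce the 6 voted outcomes and the final decision.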
(This article belongs to the Special Issue AI-Based Automated Recognition and Detection in Healthcare)

16 pages, 624 KiB  
Article
Towards the Development of the Clinical Decision Support System for the Identification of Respiration Diseases via Lung Sound Classification Using 1D-CNN
by Syed Waqad Ali, Muhammad Munaf Rashid, Muhammad Uzair Yousuf, Sarmad Shams, Muhammad Asif, Muhammad Rehan and Ikram Din Ujjan
Sensors 2024, 24(21), 6887; https://doi.org/10.3390/s24216887 - 27 Oct 2024
Cited by 1 | Viewed by 1377
Abstract
Respiratory disorders are commonly regarded as complex disorders to diagnose due to their multi-factorial nature, encompassing the interplay between hereditary variables, comorbidities, environmental exposures, and therapies, among other contributing factors. This study presents a Clinical Decision Support System (CDSS) for the early detection of respiratory disorders using a one-dimensional convolutional neural network (1D-CNN) model. The ICBHI 2017 Breathing Sound Database, which contains samples of different breathing sounds, was used in this research. During pre-processing, audio clips were resampled to a uniform rate, and breathing cycles were segmented into individual instances of the lung sound. A One-Dimensional Convolutional Neural Network (1D-CNN) consisting of convolutional layers, max pooling layers, dropout layers, and fully connected layers, was designed to classify the processed clips into four categories: normal, crackles, wheezes, and combined crackles and wheezes. To address class imbalance, the Synthetic Minority Over-sampling Technique (SMOTE) was applied to the training data. Hyperparameters were optimized using grid search with k-fold cross-validation. The model achieved an overall accuracy of 0.95, outperforming state-of-the-art methods. Particularly, the normal and crackles categories attained the highest F1-scores of 0.97 and 0.95, respectively. The model’s robustness was further validated through 5-fold and 10-fold cross-validation experiments. This research highlighted an essential aspect of diagnosing lung sounds through artificial intelligence and utilized the 1D-CNN to classify lung sounds accurately. The proposed advancement of technology shall enable medical care practitioners to diagnose lung disorders in an improved manner, leading to better patient care.
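The SMOTE step used here to rebalance the training data can be illustrated with a minimal pure-Python sketch. Real pipelines would typically call `imblearn`'s `SMOTE`; this sketch only shows the core idea, synthesizing minority-class samples by interpolating between a sample and one of its nearest neighbors.

```python
import random

def smote_oversample(minority, n_new, k=3, seed=0):
    """Minimal SMOTE-style oversampling for feature vectors: each
    synthetic sample is a random interpolation between a minority
    sample and one of its k nearest (squared-Euclidean) neighbors."""
    rng = random.Random(seed)
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbors = sorted((m for m in minority if m is not base),
                           key=lambda m: dist(base, m))[:k]
        nb = rng.choice(neighbors)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([x + gap * (y - x) for x, y in zip(base, nb)])
    return synthetic
```

Because each synthetic point lies on a segment between two real minority samples, the oversampled class stays inside the convex hull of the original minority data.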
(This article belongs to the Special Issue AI-Based Automated Recognition and Detection in Healthcare)

19 pages, 46165 KiB  
Article
A Deep-Learning-Based CPR Action Standardization Method
by Yongyuan Li, Mingjie Yin, Wenxiang Wu, Jiahuan Lu, Shangdong Liu and Yimu Ji
Sensors 2024, 24(15), 4813; https://doi.org/10.3390/s24154813 - 24 Jul 2024
Cited by 1 | Viewed by 1493
Abstract
In emergency situations, ensuring standardized cardiopulmonary resuscitation (CPR) actions is crucial. However, current automated external defibrillators (AEDs) lack methods to determine whether CPR actions are performed correctly, leading to inconsistent CPR quality. To address this issue, we introduce a novel method called deep-learning-based CPR action standardization (DLCAS). This method involves three parts. First, it detects correct posture using OpenPose to recognize skeletal points. Second, it identifies a marker wristband with our CPR-Detection algorithm and measures compression depth, count, and frequency using a depth algorithm. Finally, we optimize the algorithm for edge devices to enhance real-time processing speed. Extensive experiments on our custom dataset have shown that the CPR-Detection algorithm achieves a mAP0.5 of 97.04%, while reducing parameters to 0.20 M and FLOPs to 132.15 K. In a complete CPR operation procedure, the depth measurement solution achieves an accuracy of 90% with a margin of error less than 1 cm, while the count and frequency measurements achieve 98% accuracy with a margin of error less than two counts. Our method meets the real-time requirements in medical scenarios, and the processing speed on edge devices has increased from 8 fps to 25 fps.
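The count and frequency measurements described above can be illustrated with a simple peak-counting sketch over a per-frame compression-depth trace. This is not the paper's depth algorithm; the depth threshold and frame rate are assumptions for the example.

```python
def compression_stats(depth_cm, fps=25, min_depth=4.0):
    """Estimate chest-compression count and rate from a per-frame depth
    trace (cm). A compression is a local maximum at or above min_depth;
    rate is reported in compressions per minute."""
    peaks = [i for i in range(1, len(depth_cm) - 1)
             if depth_cm[i - 1] < depth_cm[i] >= depth_cm[i + 1]
             and depth_cm[i] >= min_depth]
    duration_s = len(depth_cm) / fps
    rate_cpm = 60 * len(peaks) / duration_s if duration_s else 0.0
    return len(peaks), rate_cpm
```

On an edge device this would run over a sliding window of recent frames so the rate estimate updates in real time.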
(This article belongs to the Special Issue AI-Based Automated Recognition and Detection in Healthcare)

25 pages, 3406 KiB  
Article
Machine Learning Algorithms for Processing and Classifying Unsegmented Phonocardiographic Signals: An Efficient Edge Computing Solution Suitable for Wearable Devices
by Roberto De Fazio, Lorenzo Spongano, Massimo De Vittorio, Luigi Patrono and Paolo Visconti
Sensors 2024, 24(12), 3853; https://doi.org/10.3390/s24123853 - 14 Jun 2024
Cited by 1 | Viewed by 1447
Abstract
The phonocardiogram (PCG) can be used as an affordable way to monitor heart conditions. This study proposes the training and testing of several classifiers based on SVMs (support vector machines), k-NN (k-nearest neighbor), and NNs (neural networks) to perform binary (“Normal”/“Pathologic”) and multiclass (“Normal”, “CAD” (coronary artery disease), “MVP” (mitral valve prolapse), and “Benign” (benign murmurs)) classification of PCG signals, without heart sound segmentation algorithms. Two datasets of 482 and 826 PCG signals from the Physionet/CinC 2016 dataset are used to train the binary and multiclass classifiers, respectively. Each PCG signal is pre-processed, with spike removal, denoising, filtering, and normalization; afterward, it is divided into 5 s frames with a 1 s shift. Subsequently, a feature set is extracted from each frame to train and test the binary and multiclass classifiers. Concerning the binary classification, the trained classifiers yielded accuracies ranging from 92.4 to 98.7% on the test set, with memory occupations from 92.7 kB to 11.1 MB. Regarding the multiclass classification, the trained classifiers achieved accuracies spanning from 95.3 to 98.6% on the test set, occupying a memory portion from 233 kB to 14.1 MB. The NNs trained and tested in this work offer the best trade-off between performance and memory occupation, whereas the trained k-NN models obtained the best performance at the cost of large memory occupation (up to 14.1 MB). The classifiers’ performance slightly depends on the signal quality, since a denoising step is performed during pre-processing. To this end, the signal-to-noise ratio (SNR) was acquired before and after the denoising, indicating an improvement between 15 and 30 dB. The trained and tested models occupy relatively little memory, enabling their implementation in resource-limited systems.
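The windowing step described above, dividing each signal into 5 s frames with a 1 s shift, can be sketched directly. This is an illustrative sketch, assuming the last partial frame is discarded; the function name is hypothetical.

```python
def frame_signal(signal, fs, frame_s=5.0, shift_s=1.0):
    """Split a 1-D signal (sampled at fs Hz) into overlapping frames:
    frame_s-second windows advanced by shift_s seconds. Partial frames
    at the end are dropped."""
    frame_len, shift = int(frame_s * fs), int(shift_s * fs)
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, shift)]
```

A 10 s recording therefore yields six overlapping 5 s frames, each sharing 4 s with its neighbor, which multiplies the training examples available per recording.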
(This article belongs to the Special Issue AI-Based Automated Recognition and Detection in Healthcare)

17 pages, 1689 KiB  
Article
Advancing Breast Cancer Diagnosis through Breast Mass Images, Machine Learning, and Regression Models
by Amira J. Zaylaa and Sylva Kourtian
Sensors 2024, 24(7), 2312; https://doi.org/10.3390/s24072312 - 5 Apr 2024
Cited by 2 | Viewed by 2716
Abstract
Breast cancer results from a disruption of certain cells in breast tissue that undergo uncontrolled growth and cell division. These cells most often accumulate and form a lump called a tumor, which may be benign (non-cancerous) or malignant (cancerous). Malignant tumors can spread quickly throughout the body, forming tumors in other areas, which is called metastasis. Standard screening techniques are insufficient in the case of metastasis; therefore, new and advanced techniques based on artificial intelligence (AI), machine learning, and regression models have been introduced, the primary aim of which is to automatically diagnose breast cancer through the use of advanced techniques, classifiers, and real images. Real fine-needle aspiration (FNA) images were collected from Wisconsin, and four classifiers were used, including three machine learning models and one regression model: the support vector machine (SVM), naive Bayes (NB), k-nearest neighbors (k-NN), and decision tree (DT)-C4.5. According to the accuracy, sensitivity, and specificity results, the SVM algorithm had the best performance; it was the most powerful computational classifier with a 97.13% accuracy and 97.5% specificity. It also had around a 96% sensitivity for the diagnosis of breast cancer, unlike the models used for comparison, thereby providing an exact diagnosis on the one hand and a clear classification between benign and malignant tumors on the other hand. As a future research prospect, more algorithms and combinations of features can be considered for the precise, rapid, and effective classification and diagnosis of breast cancer images for imperative decisions.
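The accuracy, sensitivity, and specificity figures reported above follow directly from confusion-matrix counts. A minimal sketch, assuming labels 1 = malignant (positive) and 0 = benign (negative):

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity (recall on the positive class), and
    specificity (recall on the negative class) from paired label lists."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    n = len(y_true)
    return {
        "accuracy": (tp + tn) / n,
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
    }
```

High specificity with slightly lower sensitivity, as reported for the SVM here, means few benign cases are flagged as malignant while a small fraction of malignant cases is still missed.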
(This article belongs to the Special Issue AI-Based Automated Recognition and Detection in Healthcare)

12 pages, 1283 KiB  
Communication
Explainable Risk Prediction of Post-Stroke Adverse Mental Outcomes Using Machine Learning Techniques in a Population of 1780 Patients
by Chien Wei Oei, Eddie Yin Kwee Ng, Matthew Hok Shan Ng, Ru-San Tan, Yam Meng Chan, Lai Gwen Chan and Udyavara Rajendra Acharya
Sensors 2023, 23(18), 7946; https://doi.org/10.3390/s23187946 - 17 Sep 2023
Cited by 4 | Viewed by 2434
Abstract
Post-stroke depression and anxiety, collectively known as post-stroke adverse mental outcome (PSAMO), are common sequelae of stroke. About 30% of stroke survivors develop depression and about 20% develop anxiety. Stroke survivors with PSAMO have poorer health outcomes with higher mortality and greater functional disability. In this study, we aimed to develop a machine learning (ML) model to predict the risk of PSAMO. We retrospectively studied 1780 patients with stroke who were divided into PSAMO vs. no PSAMO groups based on results of validated depression and anxiety questionnaires. The features collected included demographic and sociological data, quality of life scores, stroke-related information, medical and medication history, and comorbidities. Recursive feature elimination was used to select features to input in parallel to eight ML algorithms to train and test the model. Bayesian optimization was used for hyperparameter tuning. Shapley additive explanations (SHAP), an explainable AI (XAI) method, was applied to interpret the model. The best performing ML algorithm was gradient-boosted tree, which attained 74.7% binary classification accuracy. Feature importance calculated by SHAP produced a list of ranked important features that contributed to the prediction, which were consistent with findings of prior clinical studies. Some of these factors were modifiable, and potentially amenable to intervention at early stages of stroke to reduce the incidence of PSAMO.
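The recursive feature elimination step mentioned above can be sketched generically: repeatedly drop the feature whose removal hurts the score least. This is an illustrative sketch with a subset-scoring callback; scikit-learn's `RFE` instead ranks features by model-derived weights, and the names here are hypothetical.

```python
def recursive_feature_elimination(features, score_fn, n_keep):
    """Backward elimination: while more than n_keep features remain,
    remove the feature whose removal leaves the highest score, where
    score_fn maps a feature subset to e.g. a cross-validation score."""
    selected = list(features)
    while len(selected) > n_keep:
        worst = max(selected,
                    key=lambda f: score_fn([g for g in selected if g != f]))
        selected.remove(worst)
    return selected
```

In a clinical-prediction pipeline like this one, `score_fn` would wrap cross-validated accuracy of the downstream classifier on the candidate feature subset.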
(This article belongs to the Special Issue AI-Based Automated Recognition and Detection in Healthcare)

Review


20 pages, 2541 KiB  
Review
Non-Contact Vision-Based Techniques of Vital Sign Monitoring: Systematic Review
by Linas Saikevičius, Vidas Raudonis, Gintaras Dervinis and Virginijus Baranauskas
Sensors 2024, 24(12), 3963; https://doi.org/10.3390/s24123963 - 19 Jun 2024
Cited by 3 | Viewed by 3990
Abstract
The development of non-contact techniques for monitoring human vital signs has significant potential to improve patient care in diverse settings. By facilitating easier and more convenient monitoring, these techniques can prevent serious health issues and improve patient outcomes, especially for those unable or unwilling to travel to traditional healthcare environments. This systematic review examines recent advancements in non-contact vital sign monitoring techniques, evaluating publicly available datasets and signal preprocessing methods. Additionally, we identified potential future research directions in this rapidly evolving field.
(This article belongs to the Special Issue AI-Based Automated Recognition and Detection in Healthcare)
