Recent Advances and Applications of Machine Learning in Pattern Recognition

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Computer Science & Engineering".

Deadline for manuscript submissions: 15 June 2026 | Viewed by 16,560

Special Issue Editors


Guest Editor
Coimbra Institute of Engineering, Polytechnic University of Coimbra, 3045-093 Coimbra, Portugal
Interests: pattern recognition; machine learning; image processing; biomedical applications

Guest Editor
Applied Research Institute, Polytechnic of Coimbra, 3045-093 Coimbra, Portugal
Interests: artificial intelligence; bioinformatics; computational biology; pattern recognition; machine learning; multi-objective optimization algorithms

Special Issue Information

Dear Colleagues,

In recent years, pattern recognition has undergone remarkable development, driven above all by machine learning techniques. These algorithms have been applied in areas such as medical image analysis, visual recognition, biometrics, remote sensing, communications, and computer vision for autonomous vehicles, enabling pattern recognition systems of greater precision and efficiency. Recent approaches make it possible to process and analyze large volumes of data, and they foster innovative solutions to complex problems.

In this Special Issue, we invite researchers, academics, and practitioners to submit original research articles, reviews, and case studies that explore the recent advances in and applications of machine learning in pattern recognition.

Dr. Verónica Vasconcelos
Dr. Maryam Abbasi
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 250 words) can be sent to the Editorial Office for assessment.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • pattern recognition
  • machine learning
  • deep learning
  • image processing
  • image segmentation
  • image detection
  • image classification
  • biometrics
  • computer vision

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (14 papers)


Research

18 pages, 1838 KB  
Article
A Deep Learning Model for Wave V Peak Detection in Auditory Brainstem Response Data
by Jun Ma, Nak-Jun Sung, Sungjun Choi, Min Hong and Sungyeup Kim
Electronics 2026, 15(3), 511; https://doi.org/10.3390/electronics15030511 - 25 Jan 2026
Viewed by 244
Abstract
In this study, we propose a YOLO-based object detection algorithm for the automated and accurate identification of the fifth wave (Wave V) in auditory brainstem response (ABR) graphs. The ABR test plays a critical role in the diagnosis of hearing disorders, with the fifth wave serving as a key marker for clinical assessment. However, conventional manual detection is time-consuming and subject to variability depending on the examiner’s expertise. To address these limitations, we developed a real-time detection method that utilizes a YOLO object detection model applied to ABR graph images. Prior to YOLO training, we employed a U-Net-based preprocessing algorithm to automatically remove existing annotated peaks from the ABR images, thereby generating training data suitable for peak detection. The proposed model was evaluated in terms of precision, recall, and mean average precision (mAP). The experimental results demonstrate that the YOLO-based approach achieves high detection performance across these metrics, indicating its potential as an effective tool for reliable Wave V peak localization in audiological applications. Full article
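For readers who want a concrete baseline to compare against, Wave V picking can be reduced to a windowed peak search. The sketch below is a naive stand-in for the paper's YOLO pipeline; the latency window and sampling rate are illustrative assumptions, not values from the study.

```python
# Naive Wave V baseline: largest sample inside a typical latency window.
# NOT the paper's YOLO pipeline -- a windowed-argmax sketch with assumed
# window (5.1-6.9 ms) and an assumed sampling rate.

def pick_wave_v(amplitudes, sample_rate_hz, window_ms=(5.1, 6.9)):
    """Return (latency_ms, amplitude) of the largest sample in the window."""
    lo = int(window_ms[0] * 1e-3 * sample_rate_hz)
    hi = int(window_ms[1] * 1e-3 * sample_rate_hz)
    segment = amplitudes[lo:hi]
    offset = max(range(len(segment)), key=segment.__getitem__)
    idx = lo + offset
    return idx / sample_rate_hz * 1e3, amplitudes[idx]
```

A learned detector earns its keep precisely where this baseline fails: noisy traces, multiple local maxima, and atypical latencies.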

17 pages, 1423 KB  
Article
Residual Motion Correction in Low-Dose Myocardial CT Perfusion Using CNN-Based Deformable Registration
by Mahmud Hasan, Aaron So and Mahmoud R. El-Sakka
Electronics 2026, 15(2), 450; https://doi.org/10.3390/electronics15020450 - 20 Jan 2026
Viewed by 270
Abstract
Dynamic myocardial CT perfusion imaging enables functional assessment of coronary artery stenosis and myocardial microvascular disease. However, it is susceptible to residual motion artifacts arising from cardiac and respiratory activity. These artifacts introduce temporal misalignments, distorting Time-Enhancement Curves (TECs) and leading to inaccurate myocardial perfusion measurements. Traditional nonrigid registration methods can address such motion but are often computationally expensive and less effective when applied to low-dose images, which are prone to increased noise and structural degradation. In this work, we present a CNN-based motion-correction framework specifically trained for low-dose cardiac CT perfusion imaging. The model leverages spatiotemporal patterns to estimate and correct residual motion between time frames, aligning anatomical structures while preserving dynamic contrast behaviour. Unlike conventional methods, our approach avoids iterative optimization and manually defined similarity metrics, enabling faster, more robust corrections. Quantitative evaluation demonstrates significant improvements in temporal alignment, with reduced Target Registration Error (TRE) and increased correlation between voxel-wise TECs and reference curves. These enhancements enable more accurate myocardial perfusion measurements. Noise from low-dose scans affects registration performance, but this remains an open challenge. This work emphasizes the potential of learning-based methods to perform effective residual motion correction under challenging acquisition conditions, thereby improving the reliability of myocardial perfusion assessment. Full article
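One of the evaluation quantities above, the correlation between a voxel's Time-Enhancement Curve (TEC) and a reference curve, is an ordinary Pearson correlation; a minimal version on toy curves rather than CT data:

```python
# Pearson correlation between two sampled curves (e.g. a voxel TEC and a
# reference TEC). Toy sketch; real TECs come from the perfusion series.
import math

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) *
                    sum((b - my) ** 2 for b in y))
    return num / den
```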

28 pages, 2342 KB  
Article
Machine Learning-Based Blood Pressure Prediction Using Cardiovascular Disease Data: A Comprehensive Comparative Study
by Irina Naskinova, Mikhail Kolev, Dilyana Karova and Mariyan Milev
Electronics 2026, 15(2), 312; https://doi.org/10.3390/electronics15020312 - 10 Jan 2026
Viewed by 549
Abstract
Hypertension remains one of the most pressing public health challenges worldwide, affecting more than one billion individuals and serving as a principal risk factor for cardiovascular morbidity and mortality. Whilst blood pressure measurement constitutes a routine component of clinical practice, the capacity to predict blood pressure values from readily obtainable patient characteristics could substantially enhance preventive care strategies and facilitate timely intervention. The present study examines whether machine learning methodologies can reliably forecast blood pressure measurements utilizing cardiovascular risk factors in conjunction with demographic and anthropometric data. We have analyzed data from 68,616 individuals following rigorous quality assessment of 70,000 patient records obtained from Kaggle’s cardiovascular disease repository. Beyond the 10 original variables, we engineered additional features encompassing demographic patterns, body composition indices, clinical risk indicators, and their interactions. Nine distinct predictive models were systematically evaluated, spanning from elementary baseline approaches through to sophisticated gradient boosting ensembles. CatBoost demonstrated superior performance, yielding systolic blood pressure predictions with a root mean squared error (RMSE) of 14.37 mmHg and coefficient of determination (R2) of 0.265, alongside diastolic blood pressure predictions with RMSE of 8.57 mmHg and R2 of 0.187. These modest explained variance values—substantially below unity—reveal a fundamental limitation: blood pressure proves remarkably resistant to prediction from the demographic, anthropometric, and clinical variables typically available in epidemiological datasets. These findings illuminate a sobering reality regarding blood pressure prediction from routinely collected clinical data. 
The observation that standard variables account for merely one-quarter of blood pressure variance should temper expectations for machine learning applications within this domain, whilst simultaneously underscoring the necessity for richer data sources or novel biomarkers to achieve clinically meaningful predictive accuracy. Full article
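The reported RMSE and R² can be reproduced from first principles; a minimal sketch with toy numbers (not the study's data):

```python
# RMSE and coefficient of determination (R^2), the two metrics reported above.
import math

def rmse(y_true, y_pred):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r_squared(y_true, y_pred):
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean) ** 2 for t in y_true)               # total sum of squares
    return 1.0 - ss_res / ss_tot
```

An R² of 0.265 means the residual sum of squares is still about three-quarters of the total variance, which is exactly the "sobering reality" the authors describe.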

20 pages, 1176 KB  
Article
DnCNN-Based Denoising Model for Low-Dose Myocardial CT Perfusion Imaging
by Mahmud Hasan, Aaron So and Mahmoud R. El-Sakka
Electronics 2026, 15(1), 124; https://doi.org/10.3390/electronics15010124 - 26 Dec 2025
Cited by 1 | Viewed by 370
Abstract
Unlike high-dose scans, low-dose cardiac CT perfusion imaging reduces patient radiation exposure and thereby the risk of potential health effects. However, it introduces significant image noise, degrading diagnostic quality and limiting clinical assessment. Denoising is thus a critical preprocessing step to enhance image quality without compromising anatomical or perfusion details. Traditionally used reconstruction-domain methods, such as Iterative Reconstruction and Compressed Sensing, are often limited by algorithmic complexity, dependence on raw sinogram data, and restricted adaptability. Conversely, image-domain methods offer more adaptable denoising options. Recently, learning-based approaches have further expanded this flexibility and demonstrated state-of-the-art performance across various denoising tasks. In this work, we present a deep learning-based denoising method specifically tuned for low-dose cardiac CT perfusion imaging. Our model is trained to reduce noise while preserving structural integrity and temporal contrast dynamics, which are critical for downstream analysis. Unlike many existing methods, our approach is optimized for perfusion data, where temporal consistency is essential. Residual cardiac motion remains a separate challenge, which we aim to address in our future work. Experimental results show significant improvements in quantitative image quality, using both reference-based and no-reference metrics, such as MSE/PSNR/SSIM and NIQE/FID/KID, as well as improved accuracy of perfusion measurements. Full article
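Of the reference-based metrics listed, MSE and PSNR are simple enough to state exactly; a pure-Python sketch for images flattened to pixel lists (the 8-bit `max_val` is an assumption):

```python
# Reference-based denoising metrics: mean squared error and peak
# signal-to-noise ratio, for flattened images with values in [0, max_val].
import math

def mse(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def psnr(a, b, max_val=255.0):
    err = mse(a, b)
    if err == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(max_val ** 2 / err)
```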

25 pages, 3370 KB  
Article
A SimAM-Enhanced Multi-Resolution CNN with BiGRU for EEG Emotion Recognition: 4D-MRSimNet
by Yutao Huang and Jijie Deng
Electronics 2026, 15(1), 39; https://doi.org/10.3390/electronics15010039 - 22 Dec 2025
Viewed by 328
Abstract
This study proposes 4D-MRSimNet, a framework that employs attention mechanisms to focus on distinct dimensions. The approach enhances key responses in the spatial and spectral domains and characterizes dynamic evolution in the temporal domain, extracting and integrating complementary emotional features to facilitate final classification. At the feature level, differential entropy (DE) and power spectral density (PSD) are combined within four core frequency bands (θ, α, β, and γ), which are recognized as closely related to emotional processing. This integration constructs a complementary feature representation that preserves both energy distribution and entropy variability. These features are organized into a 4D representation that integrates the electrode topology, frequency characteristics, and temporal dependencies inherent in EEG signals. At the network level, a multi-resolution convolutional module embedded with SimAM attention extracts spatial and spectral features at different scales and adaptively emphasizes key information. A bidirectional GRU (BiGRU) integrated with temporal attention further emphasizes critical time segments and strengthens the modeling of temporal dependencies. Experiments show that our method achieves an accuracy of 97.68% for valence and 97.61% for arousal on the DEAP dataset, and 99.60% for valence and 99.46% for arousal on the DREAMER dataset. The results demonstrate the effectiveness of complementary feature fusion, multidimensional feature representation, and the complementary dual-attention enhancement strategy for EEG emotion recognition. Full article
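The differential-entropy feature is conventionally computed per band under a Gaussianity assumption, where DE = ½·ln(2πeσ²); a minimal sketch:

```python
# Differential entropy (DE) of a band-limited EEG segment, assuming the
# band signal is approximately Gaussian: DE = 0.5 * ln(2 * pi * e * sigma^2).
import math

def differential_entropy(samples):
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / n  # population variance
    return 0.5 * math.log(2 * math.pi * math.e * var)
```

In a pipeline like the one described, this would be evaluated separately for each electrode and each of the θ, α, β, and γ bands.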

17 pages, 1639 KB  
Article
Context-Aware Tourism Recommendations Using Retrieval-Augmented Large Language Models and Semantic Re-Ranking
by Ratomir Karlović, Mia Rovis, Alma Smajić, Luka Sever and Ivan Lorencin
Electronics 2025, 14(22), 4448; https://doi.org/10.3390/electronics14224448 - 14 Nov 2025
Viewed by 1240
Abstract
This study evaluates the performance of seven large language models (LLMs) in generating context-aware recommendations. The system is built on a collection of PDF documents (brochures) describing local events and activities, which are embedded into a FAISS vector store to support semantic retrieval. Synthetic user profiles are defined to simulate diverse preferences, while static weather conditions are incorporated to enhance the contextual relevance of recommendations. To further improve output quality, a reranking step, utilizing Cohere’s API, is used to refine the top retrieved results before passing them to the LLMs for final response generation. This allows better semantic organization of relevant content in line with user context. The main aim of this research is to identify which models best integrate multimodal inputs, such as user intent, profile attributes, and environmental context, and how these insights can inform the development of adaptive, personalized recommendation systems. The main contribution of this study is a structured comparative analysis of seven LLMs applied to a tourism-specific RAG framework, providing practical insights into how effectively different models integrate contextual factors to produce personalized recommendations. The evaluation revealed notable differences in model performance, with Qwen and Phi emerging as the strongest performers, whereas LLaMA frequently produced irrelevant recommendations. Moreover, many models favored gastronomy-related venues over other types of attractions. These findings indicate that although the RAG framework provides a solid foundation, the selection of underlying models plays an important role in achieving high-quality recommendations. Full article
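The retrieve-then-rerank control flow can be sketched independently of FAISS and Cohere; both stages below are toy stand-ins (dot-product retrieval over given vectors, a supplied relevance score) so the two-stage structure is runnable without external services:

```python
# Two-stage retrieval sketch: a cheap first-stage similarity search produces
# a shortlist, and a (stand-in) reranker reorders it. Real systems would use
# a FAISS index for stage one and a learned reranker for stage two.

def retrieve(query_vec, doc_vecs, k=3):
    """First stage: indices of the top-k documents by dot-product similarity."""
    scored = sorted(range(len(doc_vecs)),
                    key=lambda i: -sum(q * d for q, d in zip(query_vec, doc_vecs[i])))
    return scored[:k]

def rerank(candidates, relevance):
    """Second stage: reorder the shortlist by a reranker's relevance score."""
    return sorted(candidates, key=lambda i: -relevance[i])
```

The point of the split is cost: the expensive reranker only ever sees the shortlist, never the full corpus.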

18 pages, 3213 KB  
Article
Automating Code Recognition for Cargo Containers
by José Santos, Daniel Canedo and António J. R. Neves
Electronics 2025, 14(22), 4437; https://doi.org/10.3390/electronics14224437 - 14 Nov 2025
Cited by 1 | Viewed by 776
Abstract
Maritime transport plays a pivotal role in global trade, where efficiency and accuracy in port operations are crucial. Among the various tasks carried out in ports, container code recognition is essential for tracking and handling cargo. Manual inspections of container codes are becoming increasingly impractical, as they induce delays and raise the risk of human error. To address these issues, this work proposes a hybrid Optical Character Recognition system that integrates YOLOv7 for text detection with the transformer-based TrOCR for recognition of the container codes, enabling accurate and efficient automated recognition. This design addresses real-world challenges such as varying light, distortions, and multi-oriented container codes. We conducted a comprehensive evaluation on datasets that simulate the conditions found in port environments. The results demonstrate that the proposed hybrid model delivers significant improvements in detection and recognition accuracy and robustness compared to traditional OCR methods. In particular, its reliability in recognizing multi-oriented codes marks a notable advancement over existing solutions. Overall, this study presents an approach to automating container code recognition, contributing to the modernization of port operations, with the potential to streamline workflows, reduce human error, and enhance the overall logistics chain. Full article
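Independently of the OCR model, container codes carry an ISO 6346 check digit, so a recognized code can be cheaply sanity-checked downstream of the recognition stage. The check below implements the standard's algorithm (letter values skip multiples of 11; positional weights are powers of two); it is a generic guard from the container-code standard, not a component the paper claims:

```python
# ISO 6346 check-digit validation for 11-character container codes
# (4 owner/category letters + 6 serial digits + 1 check digit).

def iso6346_check_digit(code10):
    """Check digit implied by the first 10 characters of a container code."""
    vals = {str(d): d for d in range(10)}
    v = 10
    for ch in "ABCDEFGHIJKLMNOPQRSTUVWXYZ":
        if v % 11 == 0:      # the standard skips values 11, 22 and 33
            v += 1
        vals[ch] = v
        v += 1
    total = sum(vals[ch] << i for i, ch in enumerate(code10))  # weight 2**i
    return total % 11 % 10

def iso6346_valid(code11):
    return iso6346_check_digit(code11[:10]) == int(code11[10])
```

Rejecting OCR outputs that fail this check is a near-free way to filter single-character recognition errors before they propagate into logistics systems.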

17 pages, 11184 KB  
Article
Automated Crack Detection in Micro-CT Scanning for Fiber-Reinforced Concrete Using Super-Resolution and Deep Learning
by João Pedro Gomes de Souza, Aristófanes Corrêa Silva, Marcello Congro, Deane Roehl, Anselmo Cardoso de Paiva, Sandra Pereira and António Cunha
Electronics 2025, 14(21), 4208; https://doi.org/10.3390/electronics14214208 - 28 Oct 2025
Cited by 1 | Viewed by 1018
Abstract
Fiber-reinforced concrete is a crucial material for civil construction, and monitoring its health is important for preserving structures and preventing accidents and financial losses. Among non-destructive monitoring methods, Micro Computed Tomography (Micro-CT) imaging stands out as an inexpensive method that is free from noise and external interference. However, manual inspection of these images is subjective and requires significant human effort. In recent years, several studies have successfully utilized Deep Learning models for the automatic detection of cracks in concrete. However, according to the literature, a gap remains in the context of detecting cracks using Micro-CT images of fiber-reinforced concrete. Therefore, this work proposes a framework for automatic crack detection that combines the following: (a) super-resolution-based preprocessing to generate, for each image, versions with double and quadruple the original resolution; (b) a classification step using EfficientNetB0 to classify the type of concrete matrix; (c) specific training of Detection Transformer (DETR) models for each type of matrix and resolution; and (d) a voting-committee-based post-processing step among the models trained for each resolution to reduce false positives. The model was trained on a new publicly available dataset, the FIRECON dataset, which consists of 4064 images annotated by an expert, achieving metrics of 86.098% Intersection over Union, 89.37% Precision, 83.26% Recall, 84.99% F1-Score, and 44.69% Average Precision. The framework, therefore, significantly reduces analysis time and improves consistency compared to the manual methods used in previous studies. The results demonstrate the potential of Deep Learning to aid image analysis in damage assessments, providing valuable insights into the damage mechanisms of fiber-reinforced concrete and contributing to the development of durable, high-performance engineering materials. Full article
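The committee idea in step (d) can be sketched as majority voting over overlapping boxes from the per-resolution models; the IoU threshold and majority count below are illustrative assumptions, not the paper's settings:

```python
# Voting-committee sketch: keep a detected box only if enough models
# independently propose a sufficiently overlapping box. Boxes are
# (x1, y1, x2, y2); the 0.5 IoU threshold is illustrative.

def iou(a, b):
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = ((a[2] - a[0]) * (a[3] - a[1]) +
             (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def committee_keep(box, other_model_boxes, thr=0.5, majority=2):
    votes = sum(any(iou(box, b) >= thr for b in boxes)
                for boxes in other_model_boxes)
    return votes + 1 >= majority  # +1 for the proposing model's own vote
```

A detection with no corroborating box from any other resolution is discarded, which is how such committees suppress false positives.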

18 pages, 43842 KB  
Article
DPO-ESRGAN: Perceptually Enhanced Super-Resolution Using Direct Preference Optimization
by Wonwoo Yun and Hanhoon Park
Electronics 2025, 14(17), 3357; https://doi.org/10.3390/electronics14173357 - 23 Aug 2025
Viewed by 2123
Abstract
Super-resolution (SR) is a long-standing task in the field of computer vision that aims to improve the quality and resolution of an image. ESRGAN is a representative generative adversarial network specialized in producing perceptually convincing SR images. However, it often fails to recover local details and still produces blurry or unnatural visual artifacts, resulting in SR images that people do not prefer. To address this problem, we propose adopting Direct Preference Optimization (DPO), which was originally devised to fine-tune large language models based on human preferences. To this end, we develop a method for applying DPO to ESRGAN and add a DPO loss for training the ESRGAN generator. Through ×4 SR experiments on benchmark datasets, we demonstrate that the proposed method produces SR images with significantly higher perceptual quality and human preference than ESRGAN and other ESRGAN variants that modify ESRGAN's loss or network structure. Specifically, compared to ESRGAN, the proposed method achieved, on average, 0.32 lower PieAPP values, 0.79 lower NIQE values, and 0.05 higher PSNR values on the BSD100 dataset, as well as 0.32 lower PieAPP values, 0.32 lower NIQE values, and 0.17 higher PSNR values on the Set14 dataset. Full article
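The DPO objective being transplanted has a compact scalar form: given log-probabilities of a preferred output y_w and a dispreferred output y_l under the trained model and a frozen reference model, the loss is the negative log-sigmoid of the scaled margin of log-ratios. A minimal sketch follows; how the ESRGAN generator exposes these log-probabilities for image pairs is the paper's contribution and is not modeled here.

```python
# Generic DPO loss on a single preference pair:
#   L = -log(sigmoid(beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))))
# where *_w is the human-preferred output and *_l the dispreferred one.
import math

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    margin = (logp_w - ref_logp_w) - (logp_l - ref_logp_l)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))
```

The loss is ln 2 when the model and reference agree (zero margin) and falls as the trained model assigns relatively more probability to the preferred output.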

30 pages, 4741 KB  
Article
TriViT-Lite: A Compact Vision Transformer–MobileNet Model with Texture-Aware Attention for Real-Time Facial Emotion Recognition in Healthcare
by Waqar Riaz, Jiancheng (Charles) Ji and Asif Ullah
Electronics 2025, 14(16), 3256; https://doi.org/10.3390/electronics14163256 - 16 Aug 2025
Cited by 2 | Viewed by 1059
Abstract
Facial emotion recognition has become increasingly important in healthcare, where understanding delicate cues like pain, discomfort, or unconsciousness can support more timely and responsive care. Yet, recognizing facial expressions in real-world settings remains challenging due to varying lighting, facial occlusions, and hardware limitations in clinical environments. To address this, we propose TriViT-Lite, a lightweight yet powerful model that blends three complementary components: MobileNet, for capturing fine-grained local features efficiently; Vision Transformers (ViT), for modeling global facial patterns; and handcrafted texture descriptors, such as Local Binary Patterns (LBP) and Histograms of Oriented Gradients (HOG), for added robustness. These multi-scale features are brought together through a texture-aware cross-attention fusion mechanism that helps the model focus dynamically on the most relevant facial regions. TriViT-Lite is evaluated on both benchmark datasets (FER2013, AffectNet) and a custom healthcare-oriented dataset covering seven critical emotional states, including pain and unconsciousness. It achieves a competitive accuracy of 91.8% on FER2013 and 87.5% on the custom dataset while maintaining real-time performance (~15 FPS) on resource-constrained edge devices. Our results show that TriViT-Lite offers a practical and accurate solution for real-time emotion recognition, particularly in healthcare settings. It strikes a balance between performance, interpretability, and efficiency, making it a strong candidate for machine-learning-driven pattern recognition in patient-monitoring applications. Full article
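Of the handcrafted descriptors mentioned, LBP is small enough to show whole: each pixel receives an 8-bit code by thresholding its 3×3 neighborhood against the center. A minimal sketch on a single patch (bit ordering is a convention; the clockwise order here is one common choice):

```python
# Local Binary Pattern code for the center pixel of a 3x3 grayscale patch.
# Each neighbor >= center contributes one bit; the 8-bit result indexes a
# texture histogram in a full LBP descriptor.

def lbp_code(patch3x3):
    c = patch3x3[1][1]
    neighbors = [patch3x3[0][0], patch3x3[0][1], patch3x3[0][2],
                 patch3x3[1][2], patch3x3[2][2], patch3x3[2][1],
                 patch3x3[2][0], patch3x3[1][0]]  # clockwise from top-left
    return sum(1 << i for i, n in enumerate(neighbors) if n >= c)
```

Because the code depends only on sign comparisons, it is invariant to monotonic illumination changes, which is exactly the robustness the abstract credits to the texture branch.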

25 pages, 8862 KB  
Article
Building a Self-Explanatory Social Robot on the Basis of an Explanation-Oriented Runtime Knowledge Model
by José Galeas, Alberto Tudela, Óscar Pons, Suna Bensch, Thomas Hellström and Antonio Bandera
Electronics 2025, 14(16), 3178; https://doi.org/10.3390/electronics14163178 - 10 Aug 2025
Cited by 1 | Viewed by 1350
Abstract
In recent years, there has been growing interest in developing robots capable of explaining their behavior, thereby improving their acceptance by humans with whom they share their environment. Proposed software designs are typically based on the advances being made in conversational systems built on deep learning techniques. However, apart from the ability to formulate explanations, the robot also needs an internal episodic memory, where it stores information from the continuous stream of experiences. Most previous proposals are designed to deal with short streams of episodic data (several minutes long). With the aim of managing larger experiences, we propose in this work a high-level episodic memory, where relevant events are abstracted to natural language concepts. The proposed framework is intimately linked to a software architecture in which the explanations, whether externalized or not, are shaped internally in a collaborative process involving the task-oriented software agents that make up the architecture. The core of this process is a runtime knowledge model, employed as working memory whose evolution allows for capturing the causal events stored in the episodic memory. We present several use cases that illustrate how the suggested framework allows an autonomous robot to generate correct and relevant explanations of its actions and behavior. Full article
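A toy reading of the high-level episodic memory idea: raw events are abstracted to natural-language concepts before storage, keeping long experience streams compact and queryable. The abstraction table below is an invented placeholder, not the paper's runtime knowledge model:

```python
# Minimal episodic memory with event-to-concept abstraction. Events whose
# raw form has no abstraction are considered irrelevant and dropped.

class EpisodicMemory:
    def __init__(self, abstractions):
        self.abstractions = abstractions  # raw event id -> natural-language concept
        self.episodes = []

    def record(self, timestamp, raw_event):
        concept = self.abstractions.get(raw_event)
        if concept:  # store only events that abstract to a known concept
            self.episodes.append((timestamp, concept))

    def recall(self, concept):
        """Timestamps at which this concept was experienced."""
        return [t for t, c in self.episodes if c == concept]
```

Storing concepts instead of raw sensor streams is what lets such a memory span hours of experience rather than minutes.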

14 pages, 1712 KB  
Article
Machine Learning-Based Predictive Model for Risk Stratification of Multiple Myeloma from Monoclonal Gammopathy of Undetermined Significance
by Amparo Santamaría, Marcos Alfaro, Cristina Antón, Beatriz Sánchez-Quiñones, Nataly Ibarra, Arturo Gil, Oscar Reinoso and Luis Payá
Electronics 2025, 14(15), 3014; https://doi.org/10.3390/electronics14153014 - 29 Jul 2025
Cited by 3 | Viewed by 1212
Abstract
Monoclonal Gammopathy of Undetermined Significance (MGUS) is a precursor to hematologic malignancies such as Multiple Myeloma (MM) and Waldenström Macroglobulinemia (WM). Accurate risk stratification of MGUS patients remains a clinical and computational challenge, with existing models often misclassifying both high-risk and low-risk individuals, leading to inefficient healthcare resource allocation. This study presents a machine learning (ML)-based approach for early prediction of MM/WM progression, using routinely collected hematological data, which are selected based on clinical relevance. A retrospective cohort of 292 MGUS patients, including 7 who progressed to malignancy, was analyzed. For each patient, a feature descriptor was constructed incorporating the latest biomarker values, their temporal trends over the previous year, age, and immunoglobulin subtype. To address the inherent class imbalance, data augmentation techniques were applied. Multiple ML classifiers were evaluated, with the Support Vector Machine (SVM) achieving the highest performance (94.3% accuracy and F1-score). The model demonstrates that a compact set of clinically relevant features can yield robust predictive performance. These findings highlight the potential of ML-driven decision-support systems in electronic health applications, offering a scalable solution for improving MGUS risk stratification, optimizing clinical workflows, and enabling earlier interventions. Full article
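The per-patient feature descriptor described above pairs each biomarker's latest value with its trend over the preceding year; a sketch using a least-squares slope, with invented field names:

```python
# Per-patient feature descriptor: for each biomarker, the latest value plus
# its least-squares slope over recent measurements, followed by age and a
# one-hot immunoglobulin subtype. Field names are illustrative only.

def slope(times, values):
    """Least-squares slope of value versus time."""
    n = len(times)
    mt, mv = sum(times) / n, sum(values) / n
    num = sum((t - mt) * (v - mv) for t, v in zip(times, values))
    den = sum((t - mt) ** 2 for t in times)
    return num / den

def descriptor(biomarkers, age, subtype_onehot):
    feats = []
    for times, values in biomarkers.values():
        feats.append(values[-1])            # latest measurement
        feats.append(slope(times, values))  # temporal trend
    return feats + [age] + subtype_onehot
```

A vector like this would then be fed to the classifiers compared in the study (SVM among them); the compactness of the representation is part of what the abstract highlights.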

24 pages, 9767 KB  
Article
Improved Binary Classification of Underwater Images Using a Modified ResNet-18 Model
by Mehrunnisa, Mikolaj Leszczuk, Dawid Juszka and Yi Zhang
Electronics 2025, 14(15), 2954; https://doi.org/10.3390/electronics14152954 - 24 Jul 2025
Cited by 3 | Viewed by 2462
Abstract
In recent years, the classification of underwater images has become one of the most active areas of research in computer vision, owing to its applications in marine science, aquatic robotics, and sea exploration. Underwater imaging is pivotal for evaluating marine ecosystems, analyzing biological habitats, and monitoring underwater infrastructure. Extracting useful information from underwater images is highly challenging due to factors such as light distortion, scattering, poor contrast, and complex foreground patterns. These difficulties cause traditional image processing and machine learning techniques to struggle to analyze such images accurately, making classification hard to perform well. Recently, deep learning techniques, especially convolutional neural networks (CNNs), have emerged as influential tools for underwater image classification, delivering noteworthy improvements in accuracy and performance despite these challenges. In this paper, we propose a modified ResNet-18 model for the binary classification of underwater images into raw and enhanced images. In the proposed model, we add new layers, namely linear, rectified linear unit (ReLU), and dropout layers, arranged in a block that is repeated three times to enhance feature extraction and improve learning. This enables the model to learn the complex patterns present in the images in greater detail, which helps it perform the classification well. Owing to these newly added layers, the proposed model copes with various complexities, such as noise, distortion, varying illumination conditions, and complex patterns, by learning robust features from underwater image datasets. To handle the class imbalance present in the dataset, we applied a data augmentation technique.
The proposed model achieved outstanding performance, with 96% accuracy, 99% precision, 92% sensitivity, 99% specificity, a 95% F1-score, and a 96% area under the receiver operating characteristic curve (AUC-ROC). These results demonstrate the strength and reliability of the proposed model in handling the challenges posed by underwater imagery, making it a favorable solution for advancing underwater image classification tasks. Full article
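The architectural change the abstract describes (a linear, ReLU, and dropout block repeated three times, appended to ResNet-18's features) can be sketched as a classifier head in PyTorch. This is a minimal illustration under stated assumptions: the layer widths, dropout rate, and placement after the backbone's 512-dimensional pooled features are guesses, not the paper's reported configuration.

```python
import torch
import torch.nn as nn

# Hypothetical sketch of the modified classifier head: a
# (Linear -> ReLU -> Dropout) block repeated three times, followed by a
# final linear layer for the two classes (raw vs. enhanced images).
# Hidden width and dropout probability are assumed values.
def make_head(in_features=512, hidden=256, num_classes=2, p=0.5):
    layers = []
    width = in_features
    for _ in range(3):  # the block repeated three times
        layers += [nn.Linear(width, hidden), nn.ReLU(), nn.Dropout(p)]
        width = hidden
    layers.append(nn.Linear(width, num_classes))
    return nn.Sequential(*layers)

head = make_head()
features = torch.randn(4, 512)  # stand-in for ResNet-18 pooled features
logits = head(features)
print(logits.shape)  # torch.Size([4, 2])
```

In practice, such a head would replace the `fc` layer of a torchvision ResNet-18, with the dropout layers active only in training mode.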

16 pages, 7057 KB  
Article
VRBiom: A New Periocular Dataset for Biometric Applications of Head-Mounted Display
by Ketan Kotwal, Ibrahim Ulucan, Gökhan Özbulak, Janani Selliah and Sébastien Marcel
Electronics 2025, 14(9), 1835; https://doi.org/10.3390/electronics14091835 - 30 Apr 2025
Cited by 1 | Viewed by 2035
Abstract
With advancements in hardware, high-quality head-mounted display (HMD) devices are being developed by numerous companies, driving increased consumer interest in AR, VR, and MR applications. This proliferation of HMD devices opens up possibilities for a wide range of applications beyond entertainment. Most commercially available HMD devices are equipped with internal inward-facing cameras to record the periocular areas. Given the nature of these devices and the captured data, many applications, such as biometric authentication and gaze analysis, become feasible. To effectively explore the potential of HMDs for these diverse use cases and to enhance the corresponding techniques, it is essential to have an HMD dataset that captures realistic scenarios. In this work, we present VRBiom, a new dataset of periocular videos acquired using a virtual reality headset. The dataset, targeted at biometric applications, consists of 900 short videos acquired from 25 individuals recorded in the NIR spectrum. These 10-second videos were captured using the internal tracking cameras of a Meta Quest Pro at 72 FPS. To encompass real-world variations, the dataset includes recordings under three gaze conditions: steady, moving, and partially closed eyes. We have also ensured an equal split of recordings with and without glasses to facilitate the analysis of eyewear. These videos, characterized by non-frontal views of the eye and relatively low spatial resolution (400×400), can be instrumental in advancing state-of-the-art research across various biometric applications. The VRBiom dataset can be utilized to evaluate, train, or adapt models for biometric use cases such as iris and/or periocular recognition and associated sub-tasks such as detection and semantic segmentation. In addition to data from real individuals, we have included around 1100 presentation attacks constructed from 92 presentation attack instruments (PAIs).
These PAIs fall into six categories, constructed through combinations of print attacks (real and synthetic identities), fake 3D eyeballs, plastic eyes, and various types of masks and mannequins. These PA videos, combined with genuine (bona fide) data, can be utilized to address concerns related to spoofing, which is a significant threat if these devices are to be used for authentication. The VRBiom dataset is publicly available for research purposes related to biometric applications only. Full article
