Pattern Recognition and Image Processing: Latest Advances and Prospects

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Computer Science & Engineering".

Deadline for manuscript submissions: 15 July 2025 | Viewed by 18972

Special Issue Editors


Guest Editor
Smart Cities Research Center, Polytechnic Institute of Tomar, Estrada do Contador, 2300-313 Tomar, Portugal
Interests: pattern recognition; image processing; machine learning

Guest Editor
Computer Science and Communications Research Centre, School of Technology and Management, Polytechnic of Leiria, Morro do Lena-Alto do Vieiro, Apartado 4163, 2411-901 Leiria, Portugal
Interests: signal processing; artificial intelligence; deep learning; genetic programming

Guest Editor
Computer Science and Communications Research Centre, School of Technology and Management, Polytechnic of Leiria, Morro do Lena-Alto do Vieiro, Apartado 4163, 2411-901 Leiria, Portugal
Interests: mobile computing; search-based software engineering; genetic programming; context-aware systems

Special Issue Information

Dear Colleagues,

Pattern recognition and image processing are areas of significant interest whose applicability extends to fields such as medicine, biology, industrial automation, security, and remote sensing. While image processing focuses on analyzing and extracting information from images, pattern recognition aims to analyze data in different formats, such as images, text, and audio, to automatically detect patterns and regularities in the data. The technological evolution of computer systems and the search for solutions with ever-higher levels of performance, in terms of precision and reliability as well as processing speed, have driven major advances in pattern recognition and image processing techniques. Non-learning-based methods have given way to machine learning techniques, with deep learning methods the main focus of recent decades. Regarding neural networks, current proposals tend to be based on modifying established network families, combining different networks, fusing the outputs of independently operating networks, combining neural networks with other machine learning methods, or applying transfer learning models.

This Special Issue intends to compile high-quality theoretical and applied research contributions in the areas of pattern recognition and image processing, with application in different domains. Potential topics include, but are not limited to, image analysis, image enhancement and reconstruction, image segmentation, classification and retrieval, image coding and compression, image interpretation and registration, motion detection and estimation, pattern recognition, and other related areas.

Dr. Sandra V.B. Jardim
Dr. Rolando Miragaia
Dr. José Carlos Bregieiro Ribeiro
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and written in good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • image processing
  • pattern recognition and analysis
  • visualization
  • image coding and compression
  • image retrieval
  • image segmentation and classification
  • computational imaging
  • motion detection

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (17 papers)


Research

30 pages, 52057 KiB  
Article
A Study on Correlation of Depth Fixation with Distance Between Dual Purkinje Images and Pupil Size
by Jinyeong Ahn and Eui Chul Lee
Electronics 2025, 14(9), 1799; https://doi.org/10.3390/electronics14091799 - 28 Apr 2025
Viewed by 70
Abstract
In recent times, 3D eye tracking methods have been actively studied to utilize gaze information in various applications. As a result, there is growing interest in gaze depth estimation techniques. This study introduces a monocular method for estimating gaze depth using DPI distance and pupil size. We acquired right eye images from eleven subjects at ten gaze depth levels ranging from 15 cm to 60 cm at intervals of 5 cm. We used a camera equipped with an infrared LED to capture the images. We applied a contour-based algorithm to detect the first Purkinje image and pupil, then used a template matching algorithm for the fourth Purkinje image. Using the detected features, we calculated the pupil size and DPI distance. We trained a multiple linear regression model on data from eight subjects, achieving an R2 value of 0.71 and a root mean squared error (RMSE) of 7.69 cm. This result indicates an approximate 3.15% reduction in error rate compared to the general linear regression model. Based on the results, we derived the following equation: depth fixation = 20.746 × DPI distance + 5.223 × pupil size + 16.495 × (DPI distance × pupil size) + 13.880. Our experiments confirmed that gaze depth can be effectively estimated from monocular images using DPI distance and pupil size. Full article
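The reported regression can be written directly as a small function. The coefficients below are quoted verbatim from the abstract; the scaling of the inputs (units or normalization of DPI distance and pupil size) is not specified there, so it is left as an assumption:

```python
def depth_fixation(dpi_distance: float, pupil_size: float) -> float:
    """Gaze depth estimate from the regression reported in the abstract.

    The input scaling (units/normalization of the two features) is not
    stated in the abstract and is assumed to match the authors' training
    data; the coefficients are quoted verbatim.
    """
    return (20.746 * dpi_distance
            + 5.223 * pupil_size
            + 16.495 * (dpi_distance * pupil_size)
            + 13.880)

# Intercept alone when both features are zero
print(depth_fixation(0.0, 0.0))
```

The interaction term (DPI distance × pupil size) is what distinguishes this model from the "general linear regression model" the abstract compares against.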

38 pages, 6239 KiB  
Article
Computational Intelligence Approach for Fall Armyworm Control in Maize Crop
by Alex B. Bertolla and Paulo E. Cruvinel
Electronics 2025, 14(7), 1449; https://doi.org/10.3390/electronics14071449 - 3 Apr 2025
Viewed by 288
Abstract
This paper presents a method for dynamic pattern recognition and classification of one dangerous caterpillar species to allow for its control in maize crops. The use of dynamic pattern recognition supports the identification of patterns in digital image data that change over time. In fact, identifying fall armyworms (Spodoptera frugiperda) is critical in maize production, i.e., in all of its growth stages. For such pest control, traditional agricultural practices are still dependent on human visual effort, resulting in significant losses and negative impacts on maize production, food security, and the economy. The developed method is based on the integration of digital image processing, multivariate statistics, and machine learning techniques. We used a supervised machine learning algorithm that classifies data by finding an optimal hyperplane that maximizes the distance between each class of caterpillar with different lengths in N-dimensional spaces. Results show the method’s efficiency, effectiveness, and suitability to support decision making for this customized control context. Full article
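The classifier described, an optimal hyperplane that maximizes the between-class margin, is a linear support vector machine. As an illustration of that idea only (not the paper's implementation), here is a minimal hinge-loss subgradient-descent SVM; the 2D toy data and hyperparameters are arbitrary:

```python
import random

def train_linear_svm(data, lam=0.01, lr=0.1, epochs=200, seed=0):
    """Subgradient descent on the regularized hinge loss: finds a
    separating hyperplane w.x + b that approximately maximizes the
    margin between the two classes (labels +1 / -1)."""
    rng = random.Random(seed)
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            margin = y * (w[0] * x[0] + w[1] * x[1] + b)
            if margin < 1:  # point inside the margin: hinge loss active
                w = [wi + lr * (y * xi - lam * wi) for wi, xi in zip(w, x)]
                b += lr * y
            else:           # only the regularizer shrinks w
                w = [wi - lr * lam * wi for wi in w]
    return w, b

# Toy linearly separable 2D data (illustrative only)
points = [((2, 2), 1), ((3, 3), 1), ((2, 3), 1),
          ((-2, -2), -1), ((-3, -3), -1), ((-2, -3), -1)]
w, b = train_linear_svm(list(points))

def predict(x):
    return 1 if w[0] * x[0] + w[1] * x[1] + b >= 0 else -1
```

A production system would use a tuned SVM library rather than this sketch, but the update rule above is the mechanism the abstract's "optimal hyperplane" phrasing refers to.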

27 pages, 3412 KiB  
Article
Efficient Clustering Method for Graph Images Using Two-Stage Clustering Technique
by Hyuk-Gyu Park, Kwang-Seong Shin and Jong-Chan Kim
Electronics 2025, 14(6), 1232; https://doi.org/10.3390/electronics14061232 - 20 Mar 2025
Viewed by 234
Abstract
Graph images, which represent data structures through nodes and edges, present significant challenges for clustering due to their intricate topological properties. Traditional clustering algorithms, such as K-means and Density-Based Spatial Clustering of Applications with Noise (DBSCAN), often struggle to effectively capture both spatial and structural relationships within graph images. To overcome these limitations, we propose a novel two-stage clustering approach that integrates conventional clustering techniques with graph-based methodologies to enhance both accuracy and efficiency. In the first stage, a distance- or density-based clustering algorithm (e.g., K-means or DBSCAN) is applied to generate initial cluster formations. In the second stage, these clusters are refined using spectral clustering or community detection techniques to better preserve and exploit topological features. We evaluate our approach using a dataset of 8118 graph images derived from depth measurements taken at various angles. The experimental results demonstrate that our method surpasses single-method clustering approaches in terms of the silhouette score, Calinski-Harabasz index (CHI), and modularity. The silhouette score measures how similar an object is to its own cluster compared to other clusters, while the CHI, also known as the Variance Ratio Criterion, evaluates cluster quality based on the ratio of between-cluster dispersion to within-cluster dispersion. Modularity, a metric commonly used in graph-based clustering, assesses the strength of division of a network into communities. Furthermore, qualitative analysis through visualization confirms that the proposed two-stage clustering approach more effectively differentiates structural similarities within graph images. These findings underscore the potential of hybrid clustering techniques for various applications, including three-dimensional (3D) measurement analysis, medical imaging, and social network analysis. Full article
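Of the three evaluation metrics named, the silhouette score has a compact definition that can be sketched directly. A minimal standard-library implementation (illustrative only, not the paper's evaluation code):

```python
from math import dist

def silhouette_score(points, labels):
    """Mean silhouette: s(i) = (b - a) / max(a, b), where a is the mean
    distance from point i to the rest of its own cluster and b is the
    mean distance to the nearest other cluster. Ranges from -1 (poor)
    to +1 (tight, well-separated clusters)."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    scores = []
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            scores.append(0.0)  # common convention for singletons
            continue
        a = sum(dist(p, q) for q in own if q != p) / (len(own) - 1)
        b = min(sum(dist(p, q) for q in c) / len(c)
                for k, c in clusters.items() if k != l)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Two well-separated toy clusters score close to 1
pts = [(0, 0), (0, 1), (10, 10), (10, 11)]
print(silhouette_score(pts, [0, 0, 1, 1]))
```

Swapping the labels so each "cluster" straddles both groups drives the score negative, which is exactly the behavior the metric is designed to expose.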

22 pages, 5641 KiB  
Article
Pose Transfer with Multi-Scale Features Combined with Latent Diffusion Model and ControlNet
by Hsu-Yung Cheng, Chia-Cheng Su, Chi-Lun Jiang and Chih-Chang Yu
Electronics 2025, 14(6), 1179; https://doi.org/10.3390/electronics14061179 - 17 Mar 2025
Viewed by 601
Abstract
In recent years, generative AI has become popular in areas like natural language processing, as well as image and audio processing, significantly expanding AI’s creative capabilities. Particularly in the realm of image generation, diffusion models have achieved remarkable success across various applications, such as image synthesis and transformation. However, traditional diffusion models operate at the pixel level when learning image features, which inevitably demands significant computational resources. To address this issue, this paper proposes a pose transfer model that integrates the latent diffusion model, ControlNet, and a multi-scale feature extraction module. Moreover, the proposed method incorporates a semantic extraction filter into the attention neural network layer. This approach enables the model to train images in the latent space, subsequently focusing on critical image features and the relationships between poses. As a result, the architecture can be efficiently trained using an RTX 4090 GPU instead of multiple A100 GPUs. This study advances generative AI by optimizing diffusion models for enhanced efficiency and scalability. Our integrated approach reduces computational demands and accelerates training, making advanced image generation more accessible to organizations with limited resources and paving the way for future innovations in AI efficiency. Full article

20 pages, 42010 KiB  
Article
Coastline and Riverbed Change Detection in the Broader Area of the City of Patras Using Very High-Resolution Multi-Temporal Imagery
by Spiros Papadopoulos, Vassilis Anastassopoulos and Georgia Koukiou
Electronics 2025, 14(6), 1096; https://doi.org/10.3390/electronics14061096 - 11 Mar 2025
Viewed by 392
Abstract
Accurate and robust information on land cover changes in urban and coastal areas is essential for effective urban land management, ecosystem monitoring, and urban planning. This paper details the methodology and results of a pixel-level classification and change detection analysis, leveraging 1945 Royal Air Force (RAF) aerial imagery and 2011 Very High-Resolution (VHR) multispectral WorldView-2 satellite imagery from the broader area of Patras, Greece. Our attention is mainly focused on the changes in the coastline from the city of Patras to the northeast direction and the two major rivers, Charadros and Selemnos. The methodology involves preprocessing steps such as registration, denoising, and resolution adjustments to ensure computational feasibility for both coastal and riverbed change detection procedures while maintaining critical spatial features. For coastal change detection over time, the Normalized Difference Water Index (NDWI) was applied to the 2011 imagery to mask out the sea from the coastline, while the 1945 archive imagery was masked manually. To determine the differences in the coastline between 1945 and 2011, we perform image differencing by subtracting the 1945 image from the 2011 image. This highlights the areas where changes have occurred over time. To conduct riverbed change detection, feature extraction using the Gray-Level Co-occurrence Matrix (GLCM) was applied to capture spatial characteristics. A Support Vector Machine (SVM) classification model was trained to distinguish river pixels from non-river pixels, enabling the identification of changes in riverbeds and achieving 92.6% and 92.5% accuracy for new and old imagery, respectively. Post-classification processing included classification maps to enhance the visualization of the detected changes. This approach highlights the potential of combining historical and modern imagery with supervised machine learning methods to effectively assess coastal erosion and riverbed alterations. Full article
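The NDWI used for the sea mask is the standard McFeeters index, (Green − NIR) / (Green + NIR): water reflects green light and absorbs near-infrared, so water pixels give positive values. A per-pixel sketch; the 0.0 threshold and the band values in the usage lines are illustrative assumptions, not values taken from the paper:

```python
def ndwi(green: float, nir: float) -> float:
    """Normalized Difference Water Index (McFeeters):
    (Green - NIR) / (Green + NIR). Positive over water, negative over
    most vegetation and bare soil."""
    if green + nir == 0:
        return 0.0  # avoid division by zero on dark pixels
    return (green - nir) / (green + nir)

def water_mask(green_band, nir_band, threshold=0.0):
    """Threshold NDWI per pixel; True marks likely water.
    A 0.0 threshold is a common default, not the paper's value."""
    return [[ndwi(g, n) > threshold for g, n in zip(g_row, n_row)]
            for g_row, n_row in zip(green_band, nir_band)]

# One-row toy image: a water pixel followed by a land pixel
print(water_mask([[0.30, 0.10]], [[0.05, 0.40]]))
```

On the real WorldView-2 data the bands would come from a raster library rather than nested lists, but the per-pixel arithmetic is the same.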

14 pages, 590 KiB  
Article
Optimizing Multiclass Classification Using Convolutional Neural Networks with Class Weights and Early Stopping for Imbalanced Datasets
by Muhammad Nazim Razali, Nureize Arbaiy, Pei-Chun Lin and Syafikrudin Ismail
Electronics 2025, 14(4), 705; https://doi.org/10.3390/electronics14040705 - 12 Feb 2025
Viewed by 1130
Abstract
Multiclass classification in machine learning often faces significant challenges due to unbalanced datasets. This situation leads to biased predictions and reduced model performance. This research addresses this issue by proposing a novel approach that combines convolutional neural networks (CNNs) with class weights and early-stopping techniques. The motivation behind this study stems from the need to improve model performance, especially for minority classes, which are often neglected in existing methodologies. Although various strategies such as resampling, ensemble methods, and data augmentation have been explored, they frequently have limitations based on the characteristics of the data and the specific model type. Our approach focuses on optimizing the loss function via class weights to give greater importance to minority classes. Therefore, it reduces bias and improves overall accuracy. Furthermore, we implement early stopping to avoid overfitting and improve generalization by continuously monitoring the validation performance during training. This study contributes to the body of knowledge by demonstrating the effectiveness of this combined technique in improving multiclass classification in unbalanced scenarios. The proposed model is tested for oil palm leaves analysis to identify deficiencies in nitrogen (N), boron (B), magnesium (Mg), and potassium (K). The CNN model with three layers and a SoftMax activation function was trained for 200 epochs each. The analysis compared three scenarios: training with the imbalanced dataset, training with class weights, and training with class weights and early stopping. The results showed that applying class weights significantly improved the classification accuracy, with a trade-off in other class predictions. This indicates that, while class weight has a positive overall impact, further strategies are necessary to improve model performance across all categories in this study. Full article
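The abstract does not give the exact weighting scheme; a common choice is the inverse-frequency ("balanced") heuristic, paired here with a patience-based early-stopping check. Both functions are sketches of the general techniques, not the authors' code:

```python
from collections import Counter

def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * count_c).
    Minority classes get weights > 1, so their misclassifications
    contribute more to the loss. (A common heuristic; the paper's
    exact scheme is not stated in the abstract.)"""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * cnt) for c, cnt in counts.items()}

def should_stop(val_losses, patience=5):
    """Early stopping: True once the best validation loss is more than
    `patience` epochs in the past, i.e., no improvement since then."""
    best_epoch = val_losses.index(min(val_losses))
    return best_epoch <= len(val_losses) - 1 - patience

# Imbalanced toy labels: the minority class 'B' gets the larger weight
print(balanced_class_weights(['N'] * 8 + ['B'] * 2))
```

In a framework such as Keras, the first function's output would be passed as the `class_weight` argument to training, and the second corresponds to monitoring validation loss with a patience counter.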

21 pages, 9588 KiB  
Article
Feasibility Study on Contactless Feature Analysis for Early Drowsiness Detection in Driving Scenarios
by Yebin Choi, Sihyeon Yang, Yoojin Park, Choin Choi and Eui Chul Lee
Electronics 2025, 14(4), 662; https://doi.org/10.3390/electronics14040662 - 8 Feb 2025
Viewed by 513
Abstract
Drowsy driving significantly impairs drivers’ attention and reaction times, increasing the risk of accidents. Developing effective prevention technologies is therefore a critical task. Previous studies have highlighted several limitations: (1) Most drowsiness-detection methods rely solely on facial features such as eye blinking or yawning, limiting their ability to detect different drowsiness levels. (2) Sensor-based methods utilizing wearable devices may interfere with driving activities. (3) Binary classification of drowsiness levels is insufficient for accident prevention, as it fails to detect early signs of drowsiness. This study proposes a novel drowsiness-detection method that classifies drowsiness into three levels (alert, low vigilant, drowsy) using a non-contact, camera-based approach that integrates physiological signals and visible facial features. Conducted as a feasibility study, it evaluates the potential applicability of this method in driving situations. To evaluate generalizability, experiments were conducted with seen-subject and unseen-subject conditions, achieving accuracies of 96.7% and 75.7%, respectively. This approach provides a more comprehensive and practical solution to drowsiness detection, contributing to safer driving environments. Full article

17 pages, 3623 KiB  
Article
Deep Learning-Based Approach for Microscopic Algae Classification with Grad-CAM Interpretability
by Maisam Ali, Muhammad Yaseen, Sikandar Ali and Hee-Cheol Kim
Electronics 2025, 14(3), 442; https://doi.org/10.3390/electronics14030442 - 22 Jan 2025
Viewed by 1218
Abstract
The natural occurrence of harmful algal blooms (HABs) adversely affects the quality of clean and fresh water. They pose increased risks to human health, aquatic ecosystems, and water bodies. Continuous monitoring and appropriate measures must be taken to combat HABs. Deep learning models that utilize computer vision play a vital role in identifying and classifying harmful algal blooms in aquatic environments and water storage facilities. Inspecting algal blooms using conventional methods, such as algae detection under microscopes, is difficult, expensive, and time-consuming. Deep learning algorithms have shown notable performance in the image classification domain and its applications, including microscopic algae species classification and detection. In this study, we propose a deep learning-based approach for classifying microscopic images of algae using computer vision. This approach employs a convolutional neural network (CNN) model integrated with two additional blocks—squeeze and dense blocks—to determine the presence of algae, followed by adding Grad-CAM to the proposed model to ensure interpretability and transparency. We performed several experiments on our custom dataset of microscopic algae images. Data augmentation techniques were employed to increase the number of images in the dataset, whereas pre-processing techniques were implemented to elevate the overall data quality. Our proposed model was trained on 3200 images consisting of four classes. We also compared our proposed model with other transfer learning models, i.e., ResNet50, EfficientNetB0, and VGG16. Our proposed model outperformed these deep learning models, demonstrating 96.7% accuracy, while ResNet50, EfficientNetB0, and VGG16 showed accuracies of 85.0%, 92.96%, and 93.5%, respectively. The results of this research demonstrate the potential of deep learning-based approaches for algae classification.
This deep learning-based algorithm can be deployed in real-time applications to classify and identify algae to ensure the quality of water reservoirs. Computer-assisted solutions are advantageous for tracking freshwater algal blooms. Using deep learning-based models to identify and classify algae species from microscopic images is a novel application in the AI community. Full article

26 pages, 23657 KiB  
Article
A Digital Twin Approach for Soil Moisture Measurement with Physically Based Rendering Simulations and Machine Learning
by Ismail Parewai and Mario Köppen
Electronics 2025, 14(2), 395; https://doi.org/10.3390/electronics14020395 - 20 Jan 2025
Cited by 1 | Viewed by 934
Abstract
Soil is one of the most important factors of agricultural productivity, directly influencing crop growth, water management, and overall yield. However, inefficient soil moisture monitoring methods, such as manual observation and gravimetric analysis in rural areas, often lead to overwatering or underwatering, wasting resources, reducing yields, and harming soil health. This study offers a digital twin approach for soil moisture measurement, integrating real-time physical data, virtual simulations, and machine learning to classify soil moisture conditions. The digital twin is proposed as a virtual representation of physical soil designed to replicate real-world behavior. We used a multispectral rotocam to capture high-resolution soil images under controlled conditions. Physically based rendering (PBR) materials were created from these data and implemented in a game engine to simulate soil properties accurately. Image processing techniques were applied to extract key features, followed by machine learning algorithms to classify soil moisture levels (wet, normal, dry). Our results demonstrate that the Soil Digital Twin replicates real-world behavior, with the Random Forest model achieving a high classification accuracy of 96.66% compared to actual soil. This data-driven approach conveys the potential of the Soil Digital Twin to enhance precision farming initiatives and water use efficiency for sustainable agriculture. Full article

17 pages, 3635 KiB  
Article
Automatic Segmentation in 3D CT Images: A Comparative Study of Deep Learning Architectures for the Automatic Segmentation of the Abdominal Aorta
by Christos Mavridis, Theodoros P. Vagenas, Theodore L. Economopoulos, Ioannis Vezakis, Ourania Petropoulou, Ioannis Kakkos and George K. Matsopoulos
Electronics 2024, 13(24), 4919; https://doi.org/10.3390/electronics13244919 - 13 Dec 2024
Cited by 2 | Viewed by 1281
Abstract
Abdominal aortic aneurysm (AAA) is a complex vascular condition associated with high mortality rates. Accurate abdominal aorta segmentation is essential in medical imaging, facilitating diagnosis and treatment for a range of cardiovascular diseases. In this regard, deep learning-based automated segmentation has shown significant promise in the precise delineation of the aorta. However, comparisons across different models remain limited, with most studies performing algorithmic training and testing on the same dataset. Furthermore, due to the variability in AAA presentation, using healthy controls for deep learning AAA segmentation poses a significant challenge. This study provides a detailed comparative analysis of four deep learning architectures—UNet, SegResNet, UNet Transformers (UNETR), and Shifted-Windows UNet Transformers (SwinUNETR)—for full abdominal aorta segmentation. The models were evaluated both qualitatively and quantitatively using private and public 3D Computed Tomography (CT) datasets. Moreover, they were successful in attaining high performance in delineating the AAA-affected aorta, while being trained on healthy aortic imaging data. Our findings indicate that the UNet architecture achieved the highest segmentation accuracy among the models tested. Full article

29 pages, 4029 KiB  
Article
Compact DINO-ViT: Feature Reduction for Visual Transformer
by Didih Rizki Chandranegara, Przemysław Niedziela and Bogusław Cyganek
Electronics 2024, 13(23), 4694; https://doi.org/10.3390/electronics13234694 - 27 Nov 2024
Cited by 1 | Viewed by 1124
Abstract
Research has been ongoing for years to discover image features that enable their best classification. One of the latest developments in this area is the Self-Distillation with No Labels Vision Transformer—DINO-ViT features. However, even for a single image, their volume is significant. Therefore, for this article we proposed to substantially reduce their size, using two methods: Principal Component Analysis and Neighborhood Component Analysis. Our developed methods, PCA-DINO and NCA-DINO, showed a significant reduction in the volume of the features, often exceeding an order of magnitude while maintaining or slightly reducing the classification accuracy, which was confirmed by numerous experiments. Additionally, we evaluated the Uniform Manifold Approximation and Projection (UMAP) method, showing the superiority of the PCA and NCA approaches. Our experiments involving modifications to patch size, attention heads, and noise insertion in DINO-ViT demonstrated that both PCA-DINO and NCA-DINO exhibited reliable accuracy. While NCA-DINO is optimal for high-performance applications despite its higher computational cost, PCA-DINO offers a faster, more resource-efficient solution, depending on the application-specific requirements. The code for our method is available on GitHub. Full article
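As an illustration of the PCA step in the simplest possible setting, two-dimensional features, where the leading eigenvector of the covariance matrix has a closed form, the following standard-library sketch projects points onto their first principal component. Real PCA-DINO reduces DINO-ViT features of much higher dimension, so this is a toy analogue of the technique, not the paper's pipeline:

```python
from math import atan2, cos, sin

def pca_project_2d(points):
    """Project 2D points onto their first principal component.

    Uses the closed-form eigen-decomposition of the 2x2 sample
    covariance matrix: the major-axis angle satisfies
    tan(2*theta) = 2*c_xy / (c_xx - c_yy).
    Returns the 1D projections and the fraction of variance kept."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    cxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    cyy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    cxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    theta = 0.5 * atan2(2 * cxy, cxx - cyy)  # leading eigenvector angle
    ux, uy = cos(theta), sin(theta)
    projected = [(x - mx) * ux + (y - my) * uy for x, y in points]
    var_kept = sum(p * p for p in projected) / (n - 1)
    return projected, var_kept / (cxx + cyy)

# Points lying on the line y = x: one component captures all variance
proj, ratio = pca_project_2d([(0, 0), (1, 1), (2, 2), (3, 3)])
print(ratio)
```

The same variance-ratio criterion is how one decides how many components to keep when compressing high-dimensional DINO-ViT features.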

14 pages, 1028 KiB  
Article
Person Identification Using Temporal Analysis of Facial Blood Flow
by Maria Raia, Thomas Stogiannopoulos, Nikolaos Mitianoudis and Nikolaos V. Boulgouris
Electronics 2024, 13(22), 4499; https://doi.org/10.3390/electronics13224499 - 15 Nov 2024
Viewed by 837
Abstract
Biometrics play an important role in modern access control and security systems. The need for novel biometrics to complement traditional biometrics has been at the forefront of research. The Facial Blood Flow (FBF) biometric trait, recently proposed by our team, is a spatio-temporal representation of facial blood flow, constructed using motion magnification from facial areas where skin is visible. Due to its design and construction, the FBF does not need information from the eyes, nose, or mouth, and, therefore, it yields a versatile biometric of great potential. In this work, we evaluate the effectiveness of novel temporal partitioning and Fast Fourier Transform-based features that capture the temporal evolution of facial blood flow. These new features, along with a “time-distributed” Convolutional Neural Network-based deep learning architecture, are experimentally shown to increase the performance of FBF-based person identification compared to our previous efforts. This study provides further evidence of FBF’s potential for use in biometric identification. Full article
14 pages, 928 KiB  
Article
Online Action Detection Incorporating an Additional Action Classifier
by Min-Hang Hsu, Chen-Chien Hsu, Yin-Tien Wang, Shao-Kang Huang and Yi-Hsing Chien
Electronics 2024, 13(20), 4110; https://doi.org/10.3390/electronics13204110 - 18 Oct 2024
Viewed by 915
Abstract
Most online action detection methods focus on solving a (K + 1) classification problem, where the additional category represents the ‘background’ class. However, training on the ‘background’ class and managing data imbalance are common challenges in online action detection. To address these issues, we propose a framework for online action detection that incorporates an additional pathway between the feature extractor and the online action detection model. Specifically, we present a configuration that retains feature distinctions for fusion with the final decision from the Long Short-Term Transformer (LSTR), enhancing its performance in the (K + 1) classification. Experimental results show that the proposed method achieves 71.2% mean Average Precision (mAP) on the THUMOS14 dataset, outperforming the 69.5% achieved by the original LSTR method. Full article
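A late-fusion step of the kind described above can be sketched as a weighted blend of the two score vectors. The fusion weight, class count, and renormalization policy here are illustrative assumptions, not the paper's actual fusion rule:

```python
# Hypothetical sketch: fusing an auxiliary K-way action classifier's scores
# with the (K + 1)-way scores of the main online detector (e.g., LSTR-like).
import numpy as np

def fuse_scores(detector_probs, aux_probs, alpha=0.5):
    """Weighted late fusion over the K action classes; the detector's extra
    'background' entry (last index) is kept and the result renormalized."""
    fused = detector_probs.copy()
    fused[:-1] = alpha * detector_probs[:-1] + (1 - alpha) * aux_probs
    return fused / fused.sum()

detector = np.array([0.1, 0.2, 0.1, 0.1, 0.5])  # K = 4 classes + background
auxiliary = np.array([0.05, 0.7, 0.15, 0.1])    # K classes only
fused = fuse_scores(detector, auxiliary)
print(fused.round(3))
```

Because the auxiliary pathway never sees the ‘background’ class, its scores can sharpen the action-class decision without inheriting the background-imbalance problem.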
19 pages, 7421 KiB  
Article
Utilizing Convolutional Neural Networks for the Effective Classification of Rice Leaf Diseases Through a Deep Learning Approach
by Salma Akter, Rashadul Islam Sumon, Haider Ali and Hee-Cheol Kim
Electronics 2024, 13(20), 4095; https://doi.org/10.3390/electronics13204095 - 17 Oct 2024
Cited by 3 | Viewed by 2409
Abstract
Rice is the primary staple food in many Asian countries, and ensuring the quality of rice crops is vital for food security. Effective crop management depends on the early and precise detection of common rice diseases such as bacterial blight, blast, brown spot, and tungro. This work presents a convolutional neural network (CNN) model for classifying rice leaf disease, targeting these four diseases. Previously, leaf pathologies in crops were mostly identified manually using specialized equipment, which was time-consuming and inefficient. This study offers a remedy for accurately diagnosing and classifying rice leaf diseases through deep learning techniques. Trained on a dataset of rice leaf images, the proposed CNN model learned to identify complex patterns and attributes linked to each disease. The model achieved an exceptional accuracy of 99.99%, surpassing the benchmarks set by existing state-of-the-art models. The proposed model can serve as a diagnostic and early warning system for rice leaf diseases, helping farmers and other agricultural professionals reduce crop losses and enhance the quality of their yields. Full article
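A four-way CNN classifier of the general kind described can be sketched in PyTorch. The layer sizes, input resolution, and architecture below are illustrative assumptions and do not reproduce the authors' network:

```python
# Hypothetical sketch: a small CNN for 4-way rice leaf disease classification
# (bacterial blight, blast, brown spot, tungro).
import torch
import torch.nn as nn

class RiceLeafCNN(nn.Module):
    def __init__(self, n_classes=4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # global pooling -> fixed-size vector
        )
        self.classifier = nn.Linear(64, n_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

model = RiceLeafCNN()
logits = model(torch.randn(2, 3, 128, 128))  # batch of 2 RGB leaf images
print(logits.shape)  # torch.Size([2, 4])
```

Training would pair this with cross-entropy loss over labeled leaf images; the global average pooling keeps the head independent of input resolution.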
20 pages, 17355 KiB  
Article
Low-Light Image Enhancement Network Using Informative Feature Stretch and Attention
by Sung Min Chun, Jun Young Park and Il Kyu Eom
Electronics 2024, 13(19), 3883; https://doi.org/10.3390/electronics13193883 - 30 Sep 2024
Viewed by 1670
Abstract
Low-light images often exhibit reduced brightness, weak contrast, and color distortion. Consequently, enhancing low-light images is essential to make them suitable for computer vision tasks. Nevertheless, addressing this task is particularly challenging because of the inherent constraints posed by low-light environments. In this study, we propose a novel low-light image enhancement network using adaptive feature stretching and informative attention. The proposed network architecture mainly includes an adaptive feature stretch block designed to extend the narrow range of image features to a broader range. To achieve improved image restoration, an informative attention block is introduced to assign weight to the output features from the adaptive feature stretch block. We conduct comprehensive experiments on widely used benchmark datasets to assess the effectiveness of the proposed network. The experimental results show that the proposed low-light image enhancement network yields satisfactory results compared with existing state-of-the-art methods from both subjective and objective perspectives while maintaining acceptable network complexity. Full article
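The core intuition of stretching a narrow feature range to a broader one can be sketched with a simple per-channel min-max stretch. This is an illustrative simplification: the paper's block is adaptive and learned, whereas the function below is a fixed normalization:

```python
# Hypothetical sketch: stretching narrow-range feature maps to [0, 1],
# a simplified stand-in for an adaptive feature stretch block.
import numpy as np

def stretch(features, eps=1e-6):
    """Per-channel min-max stretch so compressed low-light responses
    occupy the full dynamic range."""
    lo = features.min(axis=(-2, -1), keepdims=True)
    hi = features.max(axis=(-2, -1), keepdims=True)
    return (features - lo) / (hi - lo + eps)

# Dim, narrow-range feature maps: 8 channels of 32x32 values in [0.02, 0.11].
fmap = np.random.default_rng(0).uniform(0.02, 0.11, size=(8, 32, 32))
out = stretch(fmap)
print(round(float(out.min()), 3), round(float(out.max()), 3))  # ~0.0 ~1.0
```

In the paper's design, an attention block then weights the stretched features so that informative regions dominate the restoration.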
15 pages, 2700 KiB  
Article
Study on the Generation and Comparative Analysis of Ethnically Diverse Faces for Developing a Multiracial Face Recognition Model
by Yeongje Park, Junho Baek, Seunghyun Kim, Seung-Min Jeong, Hyunsoo Seo and Eui Chul Lee
Electronics 2024, 13(18), 3627; https://doi.org/10.3390/electronics13183627 - 12 Sep 2024
Viewed by 1457
Abstract
Despite major breakthroughs in facial recognition technology, problems with bias and a lack of diversity still plague face recognition systems today. To address these issues, we created synthetic face data using a diffusion-based generative model and used it to fine-tune already-high-performing models. To achieve a more balanced performance across races, the synthetic dataset was created following the dual-condition face generator (DCFace) approach and using race-varied data from BUPT-BalancedFace as well as FairFace. To verify the proposed method, we fine-tuned a pre-trained improved residual network (IResNet)-100 model with additive angular margin (ArcFace) loss using the synthetic dataset. The results show that the racial performance gap, measured as the standard deviation of per-race accuracy, is reduced from 0.0107 to 0.0098, while overall accuracy increases from 96.125% to 96.1625%. The improved racial balance and diversity in the synthetic dataset led to an improvement in model fairness, demonstrating that this resource could facilitate more equitable face recognition systems. This method provides a low-cost way to address data diversity challenges and to make face recognition more accurate across demographic groups. The results highlight that advanced synthetic datasets created with diffusion-based models can improve facial recognition accuracy and fairness alike, and should not be overlooked by developers of artificial intelligence (AI) systems. Full article
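The fairness metric reported above, the standard deviation of per-race accuracy, is simple to compute. The group accuracies below are made-up illustrative numbers, not the paper's measurements:

```python
# Hypothetical sketch: fairness as the standard deviation of per-race
# accuracy (lower std = more balanced performance across groups).
import numpy as np

def racial_gap(per_group_acc):
    """Return (mean accuracy, std of per-group accuracy)."""
    accs = np.array(list(per_group_acc.values()))
    return float(accs.mean()), float(accs.std())

# Illustrative per-race verification accuracies.
per_group = {"African": 0.948, "Asian": 0.960, "Caucasian": 0.972, "Indian": 0.965}
mean_acc, gap = racial_gap(per_group)
print(round(mean_acc, 5), round(gap, 4))
```

Fine-tuning that raises the lowest-performing groups shrinks this standard deviation even when the mean barely moves, which is exactly the pattern the paper reports.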
28 pages, 4253 KiB  
Article
Real-Time Personal Protective Equipment Non-Compliance Recognition on AI Edge Cameras
by Pubudu Sanjeewani, Glenn Neuber, John Fitzgerald, Nadeesha Chandrasena, Stijn Potums, Azadeh Alavi and Christopher Lane
Electronics 2024, 13(15), 2990; https://doi.org/10.3390/electronics13152990 - 29 Jul 2024
Cited by 4 | Viewed by 2812
Abstract
Despite advancements in technology, safety equipment, and training within the construction industry over recent decades, the prevalence of fatal and nonfatal injuries and accidents remains a significant concern among construction workers. Hard hats and safety vests are crucial safety gear known to mitigate severe head trauma and other injuries. However, adherence to safety protocols, including the use of such gear, is often inadequate, posing potential risks to workers. Moreover, current manual safety monitoring systems are laborious and time-consuming. To address these challenges and enhance workplace safety, there is a pressing need to automate safety monitoring processes economically, with reduced processing times. This research proposes a deep learning-based pipeline for real-time identification of non-compliance with wearing hard hats and safety vests, enabling safety officers to preempt hazards and mitigate risks at construction sites. We evaluate various neural networks for edge deployment and find that the Single Shot Multibox Detector (SSD) MobileNet V2 model excels in computational efficiency, making it particularly suitable for this application-oriented task. The experiments and comparative analyses demonstrate the pipeline’s effectiveness in accurately identifying instances of non-compliance across different scenarios, underscoring its potential for improving safety outcomes. Full article
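Downstream of the detector, the non-compliance decision reduces to matching detected gear to detected people. The label names, box format, and center-in-box matching rule below are illustrative assumptions about what such a pipeline stage might look like, not the authors' implementation:

```python
# Hypothetical sketch: flagging PPE non-compliance from detector output.
# Detections are assumed to be dicts with 'label' and 'box' (x1, y1, x2, y2);
# a gear item counts as worn when its box center lies inside the person box.

def center_in_box(box, container):
    cx, cy = (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
    return container[0] <= cx <= container[2] and container[1] <= cy <= container[3]

def non_compliant(detections, required=("hard_hat", "safety_vest")):
    """Return the boxes of persons missing any required gear item."""
    persons = [d for d in detections if d["label"] == "person"]
    gear = [d for d in detections if d["label"] in required]
    flagged = []
    for p in persons:
        worn = {g["label"] for g in gear if center_in_box(g["box"], p["box"])}
        if not set(required) <= worn:
            flagged.append(p["box"])
    return flagged

dets = [
    {"label": "person", "box": (0, 0, 100, 300)},
    {"label": "hard_hat", "box": (30, 5, 70, 40)},      # on the first person
    {"label": "safety_vest", "box": (20, 80, 80, 200)},  # on the first person
    {"label": "person", "box": (200, 0, 300, 300)},      # no gear detected
]
print(non_compliant(dets))  # [(200, 0, 300, 300)]
```

On an edge camera, a lightweight detector such as SSD MobileNet V2 supplies the detections, and only the flagged boxes need to trigger an alert.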