Search Results (60)

Search Parameters:
Keywords = HOG descriptors

16 pages, 3953 KiB  
Article
Skin Lesion Classification Using Hybrid Feature Extraction Based on Classical and Deep Learning Methods
by Maryem Zahid, Mohammed Rziza and Rachid Alaoui
BioMedInformatics 2025, 5(3), 41; https://doi.org/10.3390/biomedinformatics5030041 - 16 Jul 2025
Viewed by 247
Abstract
This paper proposes a hybrid method for skin lesion classification that combines deep learning features with conventional descriptors such as HOG, Gabor, SIFT, and LBP. Features of interest were extracted within the tumor area using the suggested fusion methods. We tested and compared features obtained from different deep learning models coupled with HOG-based features. Dimensionality reduction and performance improvement were achieved by Principal Component Analysis, after which an SVM was used for classification. The compared methods were tested on the reference database skin cancer-malignant-vs-benign. The results show a significant improvement in accuracy due to the complementarity between the conventional and deep learning-based methods. Specifically, adding HOG descriptors increased accuracy by 5% for EfficientNetB0, 7% for ResNet50, 5% for ResNet101, 1% for NASNetMobile, 1% for DenseNet201, and 1% for MobileNetV2. These findings confirm that feature fusion significantly enhances performance compared to applying each method individually.
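
As a rough illustration of the pipeline this abstract describes (not the authors' code), the sketch below fuses HOG descriptors with EfficientNet-B0 embeddings and feeds the result through PCA into an SVM; the image sizes, HOG parameters, and the torchvision backbone are assumptions.

```python
import numpy as np
import torch
from torchvision import models, transforms
from skimage.feature import hog
from skimage.transform import resize
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Deep branch: EfficientNet-B0 with its classifier head removed (1280-d output).
backbone = models.efficientnet_b0(weights=models.EfficientNet_B0_Weights.DEFAULT)
backbone.classifier = torch.nn.Identity()
backbone.eval()
prep = transforms.Compose([transforms.ToTensor(),
                           transforms.Resize((224, 224), antialias=True)])

def hybrid_features(rgb):            # rgb: (H, W, 3) uint8 lesion crop
    """Concatenate a HOG descriptor with a CNN embedding for one image."""
    gray = resize(rgb.mean(axis=2), (128, 128), anti_aliasing=True)
    f_hog = hog(gray, orientations=9, pixels_per_cell=(16, 16),
                cells_per_block=(2, 2))
    with torch.no_grad():
        f_deep = backbone(prep(rgb).unsqueeze(0))[0].numpy()
    return np.concatenate([f_hog, f_deep])

# X = np.stack([hybrid_features(im) for im in images]); y = labels
# PCA for dimensionality reduction, then an SVM, mirroring the paper's pipeline:
clf = make_pipeline(StandardScaler(), PCA(n_components=0.95), SVC(kernel="rbf"))
# clf.fit(X_train, y_train); print(clf.score(X_test, y_test))
```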

20 pages, 7366 KiB  
Article
Histogram of Polarization Gradient for Target Tracking in Infrared DoFP Polarization Thermal Imaging
by Jianguo Yang, Dian Sheng, Weiqi Jin and Li Li
Remote Sens. 2025, 17(5), 907; https://doi.org/10.3390/rs17050907 - 4 Mar 2025
Viewed by 678
Abstract
Division-of-focal-plane (DoFP) polarization imaging systems have demonstrated considerable promise in target detection and tracking in complex backgrounds. However, existing methods face challenges, including dependence on complex image preprocessing procedures and limited real-time performance. To address these issues, this study presents a novel histogram of polarization gradient (HPG) feature descriptor that enables efficient feature representation of polarization mosaic images. First, a polarization distance calculation model based on normalized cross-correlation (NCC) and local variance is constructed, which enhances the robustness of gradient feature extraction through dynamic weight adjustment. Second, a sparse Laplacian filter is introduced to achieve refined gradient feature representation. Subsequently, adaptive polarization channel correlation weights and the second-order gradient are utilized to reconstruct the degree of linear polarization (DoLP). Finally, the gradient and DoLP sign information are integrated to enhance the capability of directional expression, providing a new theoretical perspective for polarization mosaic image structure analysis. The experimental results obtained using a self-developed long-wave infrared DoFP polarization thermal imaging system demonstrate that, within the same FBACF tracking framework, the proposed HPG feature descriptor significantly outperforms the traditional grayscale {8.22%, 2.93%}, histogram of oriented gradient (HOG) {5.86%, 2.41%}, and mosaic gradient histogram (MGH) {27.19%, 18.11%} feature descriptors, where each bracketed pair gives the gains in precision and success rate. The processing speed of approximately 20 fps meets the requirements for real-time tracking applications, providing a novel technical solution for polarization imaging applications.
(This article belongs to the Special Issue Recent Advances in Infrared Target Detection)
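
The HPG descriptor itself is the paper's contribution and is not reproduced here. As background, the sketch below shows only the standard Stokes-vector computation of DoLP from a DoFP mosaic that such descriptors build on; the 2x2 superpixel angle layout is sensor-dependent and assumed.

```python
import numpy as np

def dofp_to_dolp(mosaic):
    """Split a DoFP 2x2 superpixel mosaic into its four polarization channels
    and compute the degree of linear polarization (DoLP).

    Assumes the (common, but sensor-dependent) layout
        [[90, 45],
         [135, 0]]  degrees within each 2x2 superpixel.
    """
    i90  = mosaic[0::2, 0::2].astype(np.float64)
    i45  = mosaic[0::2, 1::2].astype(np.float64)
    i135 = mosaic[1::2, 0::2].astype(np.float64)
    i0   = mosaic[1::2, 1::2].astype(np.float64)

    # Stokes parameters for linear polarization
    s0 = 0.5 * (i0 + i45 + i90 + i135)
    s1 = i0 - i90
    s2 = i45 - i135

    dolp = np.sqrt(s1**2 + s2**2) / np.maximum(s0, 1e-9)
    aop = 0.5 * np.arctan2(s2, s1)        # angle of polarization, radians
    return dolp, aop
```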

17 pages, 1978 KiB  
Article
Lightweight Deepfake Detection Based on Multi-Feature Fusion
by Siddiqui Muhammad Yasir and Hyun Kim
Appl. Sci. 2025, 15(4), 1954; https://doi.org/10.3390/app15041954 - 13 Feb 2025
Cited by 3 | Viewed by 2599
Abstract
Deepfake technology utilizes deep learning (DL)-based face manipulation techniques to seamlessly replace faces in videos, creating highly realistic but artificially generated content. Although this technology has beneficial applications in media and entertainment, misuse of its capabilities may lead to serious risks, including identity theft, cyberbullying, and false information. The integration of DL with visual cognition has resulted in important technological improvements, particularly in addressing privacy risks caused by artificially generated “deepfake” images on digital media platforms. In this study, we propose an efficient and lightweight method for detecting deepfake images and videos, making it suitable for devices with limited computational resources. To reduce the computational burden usually associated with DL models, our method integrates machine learning classifiers in combination with keyframing approaches and texture analysis. Moreover, features extracted with the histogram of oriented gradients (HOG), local binary pattern (LBP), and KAZE bands were fused and evaluated using random forest, extreme gradient boosting, extra trees, and support vector classifier algorithms. Our findings show that a feature-level fusion of the HOG, LBP, and KAZE features improves accuracy to 92% and 96% on FaceForensics++ and Celeb-DF(v2), respectively.
(This article belongs to the Collection Trends and Prospects in Multimedia)
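
A minimal sketch of the feature-level fusion described above, assuming skimage/OpenCV implementations of HOG, LBP, and KAZE; mean-pooling the variable-length KAZE descriptors and all parameter choices are illustrative, not the authors'.

```python
import cv2
import numpy as np
from skimage.feature import hog, local_binary_pattern
from sklearn.ensemble import RandomForestClassifier

kaze = cv2.KAZE_create()

def fused_features(gray):
    """Feature-level fusion of HOG, LBP, and KAZE for one grayscale face crop."""
    gray = cv2.resize(gray, (128, 128))
    f_hog = hog(gray, orientations=9, pixels_per_cell=(16, 16),
                cells_per_block=(2, 2))
    lbp = local_binary_pattern(gray, P=8, R=1, method="uniform")
    f_lbp, _ = np.histogram(lbp, bins=10, range=(0, 10), density=True)
    _, desc = kaze.detectAndCompute(gray, None)       # variable keypoint count
    f_kaze = desc.mean(axis=0) if desc is not None else np.zeros(64)
    return np.concatenate([f_hog, f_lbp, f_kaze])     # one fixed-length vector

# X = np.stack([fused_features(f) for f in keyframes]); y = real_or_fake
clf = RandomForestClassifier(n_estimators=300, random_state=0)  # one of the four
# clf.fit(X_train, y_train)
```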

24 pages, 3877 KiB  
Article
A Hybrid Approach for Sports Activity Recognition Using Key Body Descriptors and Hybrid Deep Learning Classifier
by Muhammad Tayyab, Sulaiman Abdullah Alateyah, Mohammed Alnusayri, Mohammed Alatiyyah, Dina Abdulaziz AlHammadi, Ahmad Jalal and Hui Liu
Sensors 2025, 25(2), 441; https://doi.org/10.3390/s25020441 - 13 Jan 2025
Cited by 8 | Viewed by 1151
Abstract
This paper presents an approach for event recognition in sequential images using human body part features and their surrounding context. Key body points were approximated to track and monitor their presence in complex scenarios. Various feature descriptors, including MSER (Maximally Stable Extremal Regions), SURF (Speeded-Up Robust Features), distance transform, and DOF (Degrees of Freedom), were applied to skeleton points, while BRIEF (Binary Robust Independent Elementary Features), HOG (Histogram of Oriented Gradients), FAST (Features from Accelerated Segment Test), and Optical Flow were used on silhouettes or full-body points to capture both geometric and motion-based features. Feature fusion was employed to enhance the discriminative power of the extracted features and the physical parameters computed by the different extraction techniques. The system utilized a hybrid CNN (Convolutional Neural Network) + RNN (Recurrent Neural Network) classifier for event recognition, with Grey Wolf Optimization (GWO) for feature selection. Experimental results showed high accuracy: 98.5% on the UCF-101 dataset and 99.2% on the YouTube dataset. Compared to state-of-the-art methods, our approach achieved better performance in event recognition.
(This article belongs to the Section Intelligent Sensors)
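
The full pipeline (multiple descriptors, a CNN+RNN classifier, and GWO selection) is beyond a snippet; below is a hedged sketch of just two of the listed per-frame cues — HOG shape features plus Farneback optical-flow motion histograms — with all sizes and parameters assumed.

```python
import cv2
import numpy as np
from skimage.feature import hog

def frame_descriptor(prev_gray, gray, size=(128, 128)):
    """Combine shape (HOG on the current frame) with motion (a magnitude-weighted
    orientation histogram of Farneback optical flow)."""
    g0, g1 = cv2.resize(prev_gray, size), cv2.resize(gray, size)
    f_shape = hog(g1, orientations=9, pixels_per_cell=(16, 16),
                  cells_per_block=(2, 2))
    flow = cv2.calcOpticalFlowFarneback(g0, g1, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
    f_motion, _ = np.histogram(ang, bins=8, range=(0, 2 * np.pi), weights=mag)
    f_motion = f_motion / (f_motion.sum() + 1e-9)
    return np.concatenate([f_shape, f_motion])
```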

20 pages, 7090 KiB  
Article
An Infrared and Visible Image Alignment Method Based on Gradient Distribution Properties and Scale-Invariant Features in Electric Power Scenes
by Lin Zhu, Yuxing Mao, Chunxu Chen and Lanjia Ning
J. Imaging 2025, 11(1), 23; https://doi.org/10.3390/jimaging11010023 - 13 Jan 2025
Viewed by 1084
Abstract
In grid intelligent inspection systems, automatic registration of infrared and visible light images in power scenes is a crucial research technology. Since there are obvious differences in key attributes between visible and infrared images, direct alignment often fails to achieve the expected results. To overcome the difficulty of aligning infrared and visible light images, an image alignment method is proposed in this paper. First, we use the Sobel operator to extract the edge information of the image pair. Second, the feature points in the edges are recognised by a curvature scale space (CSS) corner detector. Third, the Histogram of Orientation Gradients (HOG) is extracted as the gradient distribution characteristic of the feature points, which is normalised with the Scale Invariant Feature Transform (SIFT) algorithm to form feature descriptors. Finally, initial matching and accurate matching are achieved by the improved fast approximate nearest-neighbour matching method and adaptive thresholding, respectively. Experiments show that this method can robustly match the feature points of image pairs under rotation, scale, and viewpoint differences, and achieves excellent matching results.
(This article belongs to the Topic Computer Vision and Image Processing, 2nd Edition)
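
A rough sketch of the matching chain described above, with Shi-Tomasi corners standing in for the CSS detector and a brute-force ratio test standing in for the improved approximate matcher; the patch size and HOG settings are assumptions.

```python
import cv2
import numpy as np
from skimage.feature import hog

def edge_keypoint_descriptors(gray, n_points=500, patch=32):
    """Sobel edge map -> corners -> HOG descriptor per keypoint. Shi-Tomasi
    corners stand in for the paper's CSS detector."""
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1)
    edges = cv2.convertScaleAbs(cv2.magnitude(gx, gy))
    pts = cv2.goodFeaturesToTrack(edges, n_points, 0.01, 10).reshape(-1, 2)
    h, w = gray.shape
    r = patch // 2
    descs, kept = [], []
    for x, y in pts.astype(int):
        if r <= x < w - r and r <= y < h - r:
            d = hog(gray[y - r:y + r, x - r:x + r], orientations=8,
                    pixels_per_cell=(8, 8), cells_per_block=(2, 2))
            descs.append(d / (np.linalg.norm(d) + 1e-9))   # SIFT-style L2 norm
            kept.append((x, y))
    return np.float32(descs), kept

# Ratio-test matching between the infrared and visible descriptor sets:
# d_ir, p_ir = edge_keypoint_descriptors(ir_gray)
# d_vis, p_vis = edge_keypoint_descriptors(vis_gray)
# pairs = cv2.BFMatcher().knnMatch(d_ir, d_vis, k=2)
# good = [m for m, n in pairs if m.distance < 0.8 * n.distance]
```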

22 pages, 3158 KiB  
Article
Sensitivity Analysis of Traffic Sign Recognition to Image Alteration and Training Data Size
by Arthur Rubio, Guillaume Demoor, Simon Chalmé, Nicolas Sutton-Charani and Baptiste Magnier
Information 2024, 15(10), 621; https://doi.org/10.3390/info15100621 - 10 Oct 2024
Viewed by 2114
Abstract
Accurately classifying road signs is crucial for autonomous driving due to the high stakes involved in ensuring safety and compliance. As Convolutional Neural Networks (CNNs) have largely replaced traditional Machine Learning models in this domain, the demand for substantial training data has increased. This study aims to compare the performance of classical Machine Learning (ML) models and Deep Learning (DL) models under varying amounts of training data, particularly focusing on altered signs to mimic real-world conditions. We evaluated three classical models: Support Vector Machine (SVM), Random Forest, and Linear Discriminant Analysis (LDA), and one Deep Learning model: Convolutional Neural Network (CNN). Using the German Traffic Sign Recognition Benchmark (GTSRB) dataset, which includes approximately 40,000 German traffic signs, we introduced digital alterations to simulate conditions such as environmental wear or vandalism. Additionally, the Histogram of Oriented Gradients (HOG) descriptor was used to assist classical models. Bayesian optimization and k-fold cross-validation were employed for model fine-tuning and performance assessment. Our findings reveal a threshold in training data beyond which accuracy plateaus. Classical models showed a linear performance decrease under increasing alteration, while CNNs, despite being more robust to alterations, did not significantly outperform classical models in overall accuracy. Ultimately, classical Machine Learning models demonstrated performance comparable to CNNs under certain conditions, suggesting that effective road sign classification can be achieved with less computationally intensive approaches.
(This article belongs to the Special Issue Machine Learning and Artificial Intelligence with Applications)
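
As a baseline illustration (not the study's tuned models), the sketch below builds the classical HOG+SVM arm with k-fold cross-validation; the hyperparameters here are arbitrary stand-ins for the Bayesian-optimized values the paper uses.

```python
import numpy as np
from skimage.feature import hog
from skimage.transform import resize
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def sign_features(img):
    """HOG descriptor for one traffic-sign crop (GTSRB images vary in size)."""
    g = resize(img.mean(axis=2), (48, 48), anti_aliasing=True)
    return hog(g, orientations=9, pixels_per_cell=(8, 8), cells_per_block=(2, 2))

# X = np.stack([sign_features(im) for im in images]); y = class_ids
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0))
# k-fold cross-validation for performance assessment, as in the study:
# scores = cross_val_score(model, X, y, cv=5)
# print(scores.mean(), scores.std())
```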

25 pages, 5122 KiB  
Article
Human Emotion Recognition Based on Spatio-Temporal Facial Features Using HOG-HOF and VGG-LSTM
by Hajar Chouhayebi, Mohamed Adnane Mahraz, Jamal Riffi, Hamid Tairi and Nawal Alioua
Computers 2024, 13(4), 101; https://doi.org/10.3390/computers13040101 - 16 Apr 2024
Cited by 3 | Viewed by 2819
Abstract
Human emotion recognition is crucial in various technological domains, reflecting our growing reliance on technology. Facial expressions play a vital role in conveying and preserving human emotions. While deep learning has been successful in recognizing emotions in video sequences, it struggles to effectively model spatio-temporal interactions and identify salient features, limiting its accuracy. This paper proposes an innovative algorithm for facial expression recognition which combines a deep learning algorithm with dynamic texture methods. In the initial phase of this study, facial features were extracted using the Visual Geometry Group (VGG19) model and input into Long Short-Term Memory (LSTM) cells to capture spatio-temporal information. Additionally, the HOG-HOF descriptor was utilized to extract dynamic features from video sequences, capturing changes in facial appearance over time. Combining these models using the Multimodal Compact Bilinear (MCB) model resulted in an effective descriptor vector. This vector was then classified using a Support Vector Machine (SVM) classifier, chosen for its simpler interpretability compared to deep learning models; this choice facilitates better understanding of the decision-making process behind emotion classification. In the experimental phase, the fusion method outperformed existing state-of-the-art methods on the eNTERFACE05 database, with an improvement margin of approximately 1%. In summary, the proposed approach exhibited superior accuracy and robust detection capabilities.
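
A minimal stand-in for the spatio-temporal branch described above — frozen VGG19 features per frame feeding an LSTM. The HOG-HOF branch, MCB fusion, and SVM stage are omitted, and the layer sizes are assumptions (eNTERFACE05 has six emotion classes).

```python
import torch
import torch.nn as nn
from torchvision import models

class VggLstm(nn.Module):
    """VGG19 features per frame -> LSTM over time -> emotion logits.
    A minimal stand-in for the paper's spatio-temporal branch."""
    def __init__(self, n_classes=6, hidden=256):
        super().__init__()
        vgg = models.vgg19(weights=models.VGG19_Weights.DEFAULT)
        self.cnn = nn.Sequential(vgg.features, nn.AdaptiveAvgPool2d(1), nn.Flatten())
        for p in self.cnn.parameters():          # freeze the pretrained extractor
            p.requires_grad = False
        self.lstm = nn.LSTM(input_size=512, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, clips):                    # clips: (B, T, 3, 224, 224)
        b, t = clips.shape[:2]
        feats = self.cnn(clips.flatten(0, 1)).view(b, t, -1)
        out, _ = self.lstm(feats)
        return self.head(out[:, -1])             # logits from the last time step

# logits = VggLstm()(torch.randn(2, 8, 3, 224, 224))   # -> shape (2, 6)
```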

20 pages, 5360 KiB  
Article
An Appearance-Semantic Descriptor with Coarse-to-Fine Matching for Robust VPR
by Jie Chen, Wenbo Li, Pengshuai Hou, Zipeng Yang and Haoyu Zhao
Sensors 2024, 24(7), 2203; https://doi.org/10.3390/s24072203 - 29 Mar 2024
Cited by 1 | Viewed by 1240
Abstract
In recent years, semantic segmentation has made significant progress in visual place recognition (VPR) by using semantic information that is relatively invariant to appearance and viewpoint, demonstrating great potential. However, in some extreme scenarios, there may be semantic occlusion and semantic sparsity, which can lead to confusion when relying solely on semantic information for localization. Therefore, this paper proposes a novel VPR framework that employs a coarse-to-fine image matching strategy, combining semantic and appearance information to improve algorithm performance. First, we construct SemLook global descriptors using semantic contours, which can preliminarily screen images to enhance the accuracy and real-time performance of the algorithm. Based on this, we introduce SemLook local descriptors for fine screening, combining robust appearance information extracted by deep learning with semantic information. These local descriptors can address issues such as semantic overlap and sparsity in urban environments, further improving the accuracy of the algorithm. Through this refined screening process, we can effectively handle the challenges of complex image matching in urban environments and obtain more accurate results. The performance of SemLook descriptors is evaluated on three public datasets (Extended-CMU Season, Robot-Car Seasons v2, and SYNTHIA) and compared with six state-of-the-art VPR algorithms (HOG, CoHOG, AlexNet_VPR, Region VLAD, Patch-NetVLAD, Forest). In the experimental comparison, considering both real-time performance and evaluation metrics, the SemLook descriptors are found to outperform the other six algorithms. Evaluation metrics include the area under the curve (AUC) based on the precision–recall curve, Recall@100%Precision, and Precision@100%Recall. On the Extended-CMU Season dataset, SemLook descriptors achieve a 100% AUC value, and on the SYNTHIA dataset, they achieve a 99% AUC value, demonstrating outstanding performance. The experimental results indicate that introducing global descriptors for initial screening and utilizing local descriptors combining both semantic and appearance information for precise matching can effectively address the issue of location recognition in scenarios with semantic ambiguity or sparsity. This algorithm enhances descriptor performance, making it more accurate and robust in scenes with variations in appearance and viewpoint.
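
SemLook itself is not reproduced here; the sketch below only illustrates the coarse-screening idea using the whole-image HOG baseline from the paper's comparison, ranking database places by cosine similarity. Image size and descriptor parameters are assumptions.

```python
import numpy as np
from skimage.feature import hog
from skimage.transform import resize

def global_descriptor(img):
    """Whole-image HOG, one of the VPR baselines in the paper's comparison."""
    g = resize(img.mean(axis=2), (128, 128), anti_aliasing=True)
    d = hog(g, orientations=8, pixels_per_cell=(16, 16), cells_per_block=(2, 2))
    return d / (np.linalg.norm(d) + 1e-9)

def retrieve(query_img, db):
    """Coarse screening: rank database places by cosine similarity."""
    sims = db @ global_descriptor(query_img)   # rows of db are pre-normalized
    return np.argsort(-sims)                   # best-matching indices first

# db = np.stack([global_descriptor(im) for im in map_images])
# ranked = retrieve(camera_frame, db)
```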

23 pages, 19390 KiB  
Article
Semi-Symmetrical, Fully Convolutional Masked Autoencoder for TBM Muck Image Segmentation
by Ke Lei, Zhongsheng Tan, Xiuying Wang and Zhenliang Zhou
Symmetry 2024, 16(2), 222; https://doi.org/10.3390/sym16020222 - 12 Feb 2024
Cited by 10 | Viewed by 1909
Abstract
Deep neural networks are effectively utilized for the instance segmentation of muck images from tunnel boring machines (TBMs), providing real-time insights into the surrounding rock condition. However, the high cost of obtaining quality labeled data limits the widespread application of this method. Addressing this challenge, this study presents a semi-symmetrical, fully convolutional masked autoencoder designed for self-supervised pre-training on extensive unlabeled muck image datasets. The model features a four-tier sparse encoder for down-sampling and a two-tier sparse decoder for up-sampling, connected via a conventional convolutional neck, forming a semi-symmetrical structure. This design enhances the model’s ability to capture essential low-level features, including geometric shapes and object boundaries. Additionally, to circumvent the trivial solutions in pixel regression that the original masked autoencoder faced, Histogram of Oriented Gradients (HOG) descriptors and Laplacian features have been integrated as novel self-supervision targets. Testing shows that the proposed model can effectively discern essential features of muck images in self-supervised training. When applied to subsequent end-to-end training tasks, it enhances the model’s performance, increasing the prediction accuracy of Intersection over Union (IoU) for muck boundaries and regions by 5.9% and 2.4%, respectively, outperforming the enhancements made by the original masked autoencoder.
(This article belongs to the Special Issue Symmetry Applied in Computer Vision, Automation, and Robotics)
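
A hedged sketch of the "HOG as self-supervision target" idea (in the spirit of MaskFeat), not the paper's semi-symmetrical autoencoder: per-cell HOG maps replace raw pixels as regression targets, and the loss is computed on masked cells only. Cell size and orientation count are assumptions.

```python
import numpy as np
import torch
import torch.nn.functional as F
from skimage.feature import hog

def hog_targets(images, cell=8):
    """Per-cell HOG maps used as regression targets instead of raw pixels,
    sidestepping the trivial solutions of plain pixel reconstruction."""
    maps = []
    for img in images:                          # img: (H, W) grayscale array
        m = hog(img, orientations=9, pixels_per_cell=(cell, cell),
                cells_per_block=(1, 1), feature_vector=False)
        maps.append(m.reshape(m.shape[0], m.shape[1], 9))   # (H/8, W/8, 9)
    return torch.tensor(np.stack(maps), dtype=torch.float32)

def masked_hog_loss(pred, target, mask):
    """MSE on the masked cells only, so the encoder must infer their HOG
    statistics. pred/target: (B, h, w, 9); mask: (B, h, w) bool, True = masked."""
    return F.mse_loss(pred[mask], target[mask])
```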

14 pages, 3035 KiB  
Article
Integrated Analysis of Machine Learning and Deep Learning in Silkworm Pupae (Bombyx mori) Species and Sex Identification
by Haibo He, Shiping Zhu, Lunfu Shen, Xuening Chang, Yichen Wang, Di Zeng, Benhua Xiong, Fangyin Dai and Tianfu Zhao
Animals 2023, 13(23), 3612; https://doi.org/10.3390/ani13233612 - 22 Nov 2023
Cited by 8 | Viewed by 2480
Abstract
Hybrid pairing of the corresponding silkworm species is a pivotal link in sericulture, ensuring egg quality and directly influencing silk quantity and quality. Considering the potential of image recognition and the impact of varying pupal postures, this study used machine learning and deep learning for global modeling to identify pupae species and sex separately or simultaneously. The performance of traditional feature-based approaches, deep learning feature-based approaches, and their fusion was compared. First, 3600 images of the back, abdomen, and side postures of 5 species of male and female pupae were captured. Next, six traditional descriptors, including the histogram of oriented gradients (HOG), and six deep learning descriptors, including ConvNeXt-S, were utilized to extract significant species and sex features. Finally, classification models were constructed using the multilayer perceptron (MLP), support vector machine, and random forest. The results indicate that the {HOG + ConvNeXt-S + MLP} model excelled, achieving 99.09% accuracy for separate species and sex recognition and 98.40% for simultaneous recognition, with areas under the precision–recall and receiver operating characteristic curves ranging from 0.984 to 1.0 and 0.996 to 1.0, respectively. In conclusion, the model can capture subtle distinctions between pupal species and sexes and shows promise for extensive application in sericulture.
(This article belongs to the Special Issue Intelligent Animal Husbandry)
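
A rough sketch of the winning {HOG + ConvNeXt-S + MLP} combination, assuming the timm library for the ConvNeXt-S trunk; the image sizes, MLP width, and preprocessing are illustrative, not the authors'.

```python
import numpy as np
import timm
import torch
from skimage.feature import hog
from skimage.transform import resize
from sklearn.neural_network import MLPClassifier

# ConvNeXt-S trunk via timm; num_classes=0 returns pooled embeddings.
convnext = timm.create_model("convnext_small", pretrained=True, num_classes=0).eval()

def pupa_features(rgb):              # rgb: (H, W, 3) uint8 pupa image
    """{HOG + ConvNeXt-S} fused descriptor, the paper's best combination."""
    gray = resize(rgb.mean(axis=2), (128, 128), anti_aliasing=True)
    f_hog = hog(gray, orientations=9, pixels_per_cell=(16, 16),
                cells_per_block=(2, 2))
    x = torch.from_numpy(rgb).permute(2, 0, 1).float().unsqueeze(0) / 255
    x = torch.nn.functional.interpolate(x, size=(224, 224), mode="bilinear")
    with torch.no_grad():
        f_deep = convnext(x)[0].numpy()
    return np.concatenate([f_hog, f_deep])

# Ten classes for simultaneous recognition: 5 species x 2 sexes.
clf = MLPClassifier(hidden_layer_sizes=(256,), max_iter=500)
# clf.fit(np.stack([pupa_features(im) for im in train_imgs]), train_labels)
```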

15 pages, 4273 KiB  
Article
Fusion of Attention-Based Convolution Neural Network and HOG Features for Static Sign Language Recognition
by Diksha Kumari and Radhey Shyam Anand
Appl. Sci. 2023, 13(21), 11993; https://doi.org/10.3390/app132111993 - 3 Nov 2023
Cited by 8 | Viewed by 2846
Abstract
The deaf and hearing-impaired community expresses their emotions, communicates with society, and enhances the interaction between humans and computers using sign language gestures. This work presents a strategy for efficient feature extraction that combines two different methods: the convolutional block attention module (CBAM)-based convolutional neural network (CNN) and the standard handcrafted histogram of oriented gradients (HOG) feature descriptor. The proposed framework aims to enhance accuracy by extracting meaningful features and resolving issues like rotation and similar hand orientation. The HOG feature extraction technique provides a compact feature representation that conveys meaningful information about sign gestures. The CBAM attention module is incorporated into the structure of the CNN to enhance feature learning using spatial and channel attention mechanisms. The final feature vector is formed by concatenating these features and is provided to the classification layers to predict static sign gestures. The proposed approach is validated on two publicly available static sign language databases, the Massey American Sign Language (ASL) and Indian Sign Language (ISL) datasets. The model’s performance is evaluated using precision, recall, F1-score, and accuracy. Our proposed methodology achieved 99.22% and 99.79% accuracy on the ASL and ISL datasets, respectively. The acquired results demonstrate the efficiency of the feature fusion and attention mechanism, and our network achieved better accuracy than earlier studies.
(This article belongs to the Special Issue Research on Image Analysis and Computer Vision)
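
The CBAM module the paper inserts into its CNN is a published building block (Woo et al., 2018); below is a compact, standard PyTorch rendering of it — not the authors' network, whose CBAM output would then be concatenated with the HOG vector as the abstract describes.

```python
import torch
import torch.nn as nn

class CBAM(nn.Module):
    """Convolutional Block Attention Module: channel attention followed by
    spatial attention, inserted after a CNN stage (Woo et al., 2018)."""
    def __init__(self, channels, reduction=16, kernel=7):
        super().__init__()
        self.mlp = nn.Sequential(                      # shared channel MLP
            nn.Conv2d(channels, channels // reduction, 1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1, bias=False))
        self.spatial = nn.Conv2d(2, 1, kernel, padding=kernel // 2, bias=False)

    def forward(self, x):
        # Channel attention: pooled descriptors -> shared MLP -> sigmoid gate.
        avg = self.mlp(torch.mean(x, dim=(2, 3), keepdim=True))
        mx = self.mlp(torch.amax(x, dim=(2, 3), keepdim=True))
        x = x * torch.sigmoid(avg + mx)
        # Spatial attention: channel-wise mean/max maps -> conv -> sigmoid gate.
        s = torch.cat([x.mean(dim=1, keepdim=True),
                       x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))

# y = CBAM(64)(torch.randn(1, 64, 32, 32))   # same shape out as in
```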

36 pages, 44840 KiB  
Article
LPHOG: A Line Feature and Point Feature Combined Rotation Invariant Method for Heterologous Image Registration
by Jianmeng He, Xin Jiang, Zhicheng Hao, Ming Zhu, Wen Gao and Shi Liu
Remote Sens. 2023, 15(18), 4548; https://doi.org/10.3390/rs15184548 - 15 Sep 2023
Cited by 5 | Viewed by 1733
Abstract
Remote sensing image registration has been a very important research topic, especially the registration of heterologous images. In recent years, numerous registration algorithms for heterologous images have been developed, especially feature-based matching algorithms, such as point feature-based or line feature-based matching methods. However, there are few matching algorithms that combine line and point features. Therefore, this study proposes a matching algorithm that combines line features and point features while achieving good rotation invariance. It comprises LSD detection of line features, keypoint extraction, and HOG-like feature descriptor construction. The matching performance is compared with state-of-the-art matching algorithms on three heterologous image datasets (optical–SAR, optical–infrared, and optical–optical), and our method’s rotation invariance is verified by rotating the images in each dataset. The experimental results show that our algorithm outperforms the state-of-the-art algorithms in matching performance while possessing very good rotation invariance.
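
A loose sketch of the LPHOG shape — detect LSD line segments, place keypoints along each segment, and describe each with a HOG-like patch descriptor. The paper's rotation-invariance machinery is omitted, and OpenCV's LSD implementation (restored in OpenCV >= 4.5.1) is assumed.

```python
import cv2
import numpy as np
from skimage.feature import hog

def line_and_point_features(gray, patch=32):
    """LSD line segments -> keypoints along each segment -> HOG-like patch
    descriptors: the broad shape of an LPHOG-style pipeline (rotation
    invariance omitted)."""
    lsd = cv2.createLineSegmentDetector()      # needs OpenCV >= 4.5.1
    lines = lsd.detect(gray)[0]
    if lines is None:
        return np.empty((0, 0), np.float32), []
    h, w = gray.shape
    r = patch // 2
    descs, pts = [], []
    for x1, y1, x2, y2 in lines.reshape(-1, 4):
        for t in (0.0, 0.5, 1.0):              # endpoints + midpoint as keypoints
            x, y = int(x1 + t * (x2 - x1)), int(y1 + t * (y2 - y1))
            if r <= x < w - r and r <= y < h - r:
                d = hog(gray[y - r:y + r, x - r:x + r], orientations=8,
                        pixels_per_cell=(8, 8), cells_per_block=(2, 2))
                descs.append(d)
                pts.append((x, y))
    return np.float32(descs), pts
```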

21 pages, 6058 KiB  
Article
A Fine-Tuned Hybrid Stacked CNN to Improve Bengali Handwritten Digit Recognition
by Ruhul Amin, Md. Shamim Reza, Yuichi Okuyama, Yoichi Tomioka and Jungpil Shin
Electronics 2023, 12(15), 3337; https://doi.org/10.3390/electronics12153337 - 4 Aug 2023
Cited by 5 | Viewed by 2245
Abstract
Recognition of Bengali handwritten digits has several unique challenges, including the variation in writing styles, the different shapes and sizes of digits, the varying levels of noise, and the distortion in the images. Despite significant improvements, there is still room for further improvement in the recognition rate. By building datasets and developing models, researchers can advance the state of the art, which can have important implications for various domains. In this paper, we introduce a new dataset of 5440 handwritten Bengali digit images acquired from a Bangladeshi university that is now publicly available. Both conventional machine learning and CNN models were used to evaluate the task. To begin, we scrutinized the results of the ML models after integrating three image feature descriptors, namely Local Binary Pattern (LBP), Complete Local Binary Pattern (CLBP), and Histogram of Oriented Gradients (HOG), using principal component analysis (PCA), which explained 95% of the variation in these descriptors. Then, via a fine-tuning approach, we designed three customized CNN models and their stack to recognize Bengali handwritten digits. On the handcrafted image features, the XGBoost classifier achieved the best accuracy at 85.29%, an ROC AUC score of 98.67%, and precision, recall, and F1 scores ranging from 85.08% to 85.18%, indicating that there was still room for improvement. On our own data, the proposed customized CNN models and their stack surpassed all other models, reaching a 99.66% training accuracy and a 97.57% testing accuracy. In addition, to test the robustness of our proposed CNN model, we used another dataset of Bengali handwritten digits obtained from the Kaggle repository, on which our stacked CNN model again performed remarkably, obtaining a training accuracy of 99.26% and a testing accuracy of 96.14%. Without rigorous image preprocessing, and with fewer epochs and less computation time, our proposed CNN model performed best and proved the most resilient across all of the datasets.
(This article belongs to the Special Issue Convolutional Neural Networks and Vision Applications, 3rd Edition)
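
A hedged sketch of the handcrafted arm described above, using plain LBP in place of CLBP (which skimage does not ship), PCA retaining 95% of the variance, and an XGBoost classifier; image size and all parameters are assumptions.

```python
import numpy as np
from skimage.feature import hog, local_binary_pattern
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from xgboost import XGBClassifier

def digit_features(img):        # img: (28, 28) grayscale digit, resized beforehand
    """Handcrafted LBP + HOG descriptor for one Bengali digit image."""
    f_hog = hog(img, orientations=9, pixels_per_cell=(7, 7), cells_per_block=(2, 2))
    lbp = local_binary_pattern(img, P=8, R=1, method="uniform")
    f_lbp, _ = np.histogram(lbp, bins=10, range=(0, 10), density=True)
    return np.concatenate([f_hog, f_lbp])

# PCA keeps the components explaining 95% of the variance, as in the paper.
model = make_pipeline(StandardScaler(), PCA(n_components=0.95),
                      XGBClassifier(n_estimators=300, eval_metric="mlogloss"))
# model.fit(np.stack([digit_features(im) for im in train_imgs]), train_labels)
```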

16 pages, 1227 KiB  
Article
A Multi-Feature Fusion Framework for Automatic Skin Cancer Diagnostics
by Samy Bakheet, Shtwai Alsubai, Aml El-Nagar and Abdullah Alqahtani
Diagnostics 2023, 13(8), 1474; https://doi.org/10.3390/diagnostics13081474 - 19 Apr 2023
Cited by 14 | Viewed by 3615
Abstract
Malignant melanoma is the most invasive skin cancer and is currently regarded as one of the deadliest disorders; however, it can be cured more successfully if detected and treated early. Recently, CAD (computer-aided diagnosis) systems have emerged as a powerful alternative tool for the automatic detection and categorization of skin lesions, such as malignant melanoma or benign nevus, in dermoscopy images. In this paper, we propose an integrated CAD framework for rapid and accurate melanoma detection in dermoscopy images. Initially, an input dermoscopy image is pre-processed using a median filter and bottom-hat filtering for noise reduction and artifact removal, thus enhancing the image quality. After this, each skin lesion is described by an effective skin lesion descriptor with high discrimination and descriptiveness capabilities, constructed by calculating the HOG (Histogram of Oriented Gradient) and LBP (Local Binary Patterns) features and their extensions. After feature selection, the lesion descriptors are fed into three supervised machine learning classification models, namely SVM (Support Vector Machine), kNN (k-Nearest Neighbors), and GAB (Gentle AdaBoost), to classify melanocytic skin lesions into one of two diagnostic categories, melanoma or nevus. Experimental results achieved using 10-fold cross-validation on the publicly available MED-NODE dermoscopy image dataset demonstrate that the proposed CAD framework performs competitively with, or superiorly to, several state-of-the-art methods with stronger training settings across various diagnostic metrics, such as accuracy (94%), specificity (92%), and sensitivity (100%).
(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)
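
A minimal sketch of the preprocessing and descriptor stages: median filtering, bottom-hat detection of dark hair artifacts (the inpainting step here is an assumption beyond the abstract), then a fused HOG+LBP descriptor into an SVM. Kernel sizes and thresholds are illustrative.

```python
import cv2
import numpy as np
from skimage.feature import hog, local_binary_pattern
from sklearn.svm import SVC

def preprocess(bgr):
    """Median filter for noise, bottom-hat filtering to find dark hair artifacts;
    the inpainting step is an assumption beyond the abstract."""
    gray = cv2.medianBlur(cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY), 5)
    se = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (17, 17))
    hair = cv2.morphologyEx(gray, cv2.MORPH_BLACKHAT, se)
    _, mask = cv2.threshold(hair, 10, 255, cv2.THRESH_BINARY)
    clean = cv2.inpaint(gray, mask, 3, cv2.INPAINT_TELEA)
    return cv2.resize(clean, (128, 128))

def lesion_descriptor(bgr):
    """Fused HOG + LBP lesion descriptor in the spirit of the paper."""
    g = preprocess(bgr)
    f_hog = hog(g, orientations=9, pixels_per_cell=(16, 16), cells_per_block=(2, 2))
    lbp = local_binary_pattern(g, P=8, R=1, method="uniform")
    f_lbp, _ = np.histogram(lbp, bins=10, range=(0, 10), density=True)
    return np.concatenate([f_hog, f_lbp])

clf = SVC(kernel="rbf")     # one of the three classifiers compared (SVM/kNN/GAB)
# clf.fit(np.stack([lesion_descriptor(im) for im in images]), labels)
```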

15 pages, 1139 KiB  
Article
Masked Face Recognition Using Histogram-Based Recurrent Neural Network
by Wei-Jie Lucas Chong, Siew-Chin Chong and Thian-Song Ong
J. Imaging 2023, 9(2), 38; https://doi.org/10.3390/jimaging9020038 - 8 Feb 2023
Cited by 6 | Viewed by 3677
Abstract
Masked face recognition (MFR) is an interesting topic in which researchers have tried to find better solutions to improve and enhance performance. Recently, COVID-19 caused most recognition systems to fail at recognizing facial images, since current face recognition cannot accurately capture or detect masked face images. This paper introduces the proposed method, known as the histogram-based recurrent neural network (HRNN) for MFR, to solve the undetected masked face problem. The proposed method includes the feature descriptor of histograms of oriented gradients (HOG) as the feature extraction process and a recurrent neural network (RNN) as the deep learning process. We have shown that the combination of both approaches works well and achieves a high true acceptance rate (TAR) of 99 percent. In addition, the proposed method is designed to overcome the underfitting problem and reduce the computational burden of large-scale dataset training. The experiments were conducted on two benchmark datasets, the Real-World Masked Face Dataset (RMFD) and the Labeled Face in the Wild Simulated Masked Face Dataset (LFW-SMFD), to validate the viability of the proposed HRNN method.
(This article belongs to the Special Issue Image Processing and Biometric Facial Analysis)
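
The abstract does not specify how the HOG features are sequenced into the RNN; the sketch below assumes one plausible reading — scanning HOG block rows top to bottom as time steps — and all dimensions are illustrative.

```python
import numpy as np
import torch
import torch.nn as nn
from skimage.feature import hog

def hog_sequence(gray):
    """Turn one face image into a sequence: each row of HOG blocks becomes a
    time step, so the recurrent model scans the face top to bottom."""
    m = hog(gray, orientations=9, pixels_per_cell=(8, 8),
            cells_per_block=(2, 2), feature_vector=False)   # (rows, cols, 2, 2, 9)
    return torch.tensor(m.reshape(m.shape[0], -1), dtype=torch.float32)

class HRNNHead(nn.Module):
    """Plain RNN over the HOG row sequence, ending in identity logits."""
    def __init__(self, feat_dim, n_identities, hidden=128):
        super().__init__()
        self.rnn = nn.RNN(feat_dim, hidden, batch_first=True)
        self.fc = nn.Linear(hidden, n_identities)

    def forward(self, x):                        # x: (B, T, feat_dim)
        out, _ = self.rnn(x)
        return self.fc(out[:, -1])               # use the last time step

# seq = hog_sequence(np.random.randint(0, 256, (128, 128), dtype=np.uint8))
# logits = HRNNHead(seq.shape[1], n_identities=100)(seq.unsqueeze(0))
```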
