Search Results (219)

Search Parameters:
Keywords = HOG feature

20 pages, 2164 KB  
Article
Automatic Vehicle Recognition: A Practical Approach with VMMR and VCR
by Andrei Istrate, Madalin-George Boboc, Daniel-Tiberius Hritcu, Florin Rastoceanu, Constantin Grozea and Mihai Enache
AI 2025, 6(12), 329; https://doi.org/10.3390/ai6120329 - 18 Dec 2025
Viewed by 599
Abstract
Background: Automatic vehicle recognition has recently become an area of great interest, providing substantial support for multiple use cases, including law enforcement and surveillance applications. In real traffic conditions, where for various reasons license plate recognition is impossible or license plates are forged, alternative solutions are required to support human personnel in identifying vehicles used for illegal activities. In such cases, appearance-based approaches relying on vehicle make and model recognition (VMMR) and vehicle color recognition (VCR) can successfully complement license plate recognition. Methods: This research addresses appearance-based vehicle identification, in which VMMR and VCR rely on inherent visual cues such as body contours, stylistic details, and exterior color. In the first stage, vehicles passing through an intersection are detected, and essential visual characteristics are extracted for the two recognition tasks. The proposed system employs deep learning with semantic segmentation and data augmentation for color recognition, while histogram of oriented gradients (HOG) feature extraction combined with a support vector machine (SVM) classifier is used for make-model recognition. For the VCR task, five different neural network architectures are evaluated to identify the most effective solution. Results: The proposed system achieves an overall accuracy of 94.89% for vehicle make and model recognition. For vehicle color recognition, the best-performing models obtain a Top-1 accuracy of 94.17% and a Top-2 accuracy of 98.41%, demonstrating strong robustness under real-world traffic conditions. Conclusions: The experimental results show that the proposed automatic vehicle recognition system provides an efficient and reliable solution for appearance-based vehicle identification. 
By combining region-tailored data, segmentation-guided processing, and complementary recognition strategies, the system effectively supports real-world surveillance and law-enforcement scenarios where license plate recognition alone is insufficient. Full article
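The HOG + SVM stage of a make-model pipeline like the one described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the striped synthetic images stand in for cropped vehicle views, the two classes stand in for two makes, and all HOG parameters are assumed defaults.

```python
import numpy as np
from skimage.feature import hog
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)

def hog_descriptor(img):
    # 9-bin HOG over 8x8-pixel cells with 2x2-cell blocks (skimage defaults)
    return hog(img, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), feature_vector=True)

def make_image(cls):
    # Synthetic stand-in for a cropped vehicle view: the two "makes" differ
    # only in their dominant edge orientation (horizontal vs. vertical).
    img = rng.normal(0.0, 0.05, (64, 64))
    if cls == 0:
        img[::8, :] += 1.0
    else:
        img[:, ::8] += 1.0
    return img

labels = [0, 1] * 40
X = np.array([hog_descriptor(make_image(c)) for c in labels])
y = np.array(labels)

clf = LinearSVC(C=1.0).fit(X[:60], y[:60])  # train on the first 60 samples
acc = clf.score(X[60:], y[60:])             # evaluate on the held-out 20
print(round(acc, 2))
```

HOG suits make-model cues because body contours and stylistic details show up as characteristic distributions of edge orientations, which the per-cell histograms capture directly.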

44 pages, 6045 KB  
Article
A Multi-Stage Hybrid Learning Model with Advanced Feature Fusion for Enhanced Prostate Cancer Classification
by Sameh Abd El-Ghany and A. A. Abd El-Aziz
Diagnostics 2025, 15(24), 3235; https://doi.org/10.3390/diagnostics15243235 - 17 Dec 2025
Viewed by 324
Abstract
Background: Cancer poses a significant health risk to humans, with prostate cancer (PCa) being the second most common and deadly form among men, following lung cancer. Each year, it affects over a million individuals and presents substantial diagnostic challenges due to variations in tissue appearance and imaging quality. In recent decades, various techniques utilizing Magnetic Resonance Imaging (MRI) have been developed for identifying and classifying PCa. Accurate classification in MRI typically requires the integration of complementary feature types, such as deep semantic representations from Convolutional Neural Networks (CNNs) and handcrafted descriptors like Histogram of Oriented Gradients (HOG). Therefore, a more robust and discriminative feature integration strategy is crucial for enhancing computer-aided diagnosis performance. Objectives: This study aims to develop a multi-stage hybrid learning model that combines deep and handcrafted features, investigates various feature reduction and classification techniques, and improves diagnostic accuracy for prostate cancer using magnetic resonance imaging. Methods: The proposed framework integrates deep features extracted from convolutional architectures with handcrafted texture descriptors to capture both semantic and structural information. Multiple dimensionality reduction methods, including singular value decomposition (SVD), were evaluated to optimize the fused feature space. Several machine learning (ML) classifiers were benchmarked to identify the most effective diagnostic configuration. The overall framework was validated using k-fold cross-validation to ensure reliability and minimize evaluation bias. 
Results: Experimental results on the Transverse Plane Prostate (TPP) dataset for binary classification tasks showed that the hybrid model significantly outperformed individual deep or handcrafted approaches, achieving superior accuracy of 99.74%, specificity of 99.87%, precision of 99.87%, sensitivity of 99.61%, and F1-score of 99.74%. Conclusions: By combining complementary feature extraction, dimensionality reduction, and optimized classification, the proposed model offers a reliable and generalizable solution for prostate cancer diagnosis and demonstrates strong potential for integration into intelligent clinical decision-support systems. Full article
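The fusion-then-reduction idea in this abstract can be sketched as concatenating the two feature families and applying SVD before a classifier. This is a hedged sketch only: the random matrices below are stand-ins for real CNN activations and HOG descriptors, and the 64-component target is an arbitrary assumption.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n = 200
y = rng.integers(0, 2, n)

# Random stand-ins for the two feature families: "deep" semantic features
# and handcrafted HOG descriptors, each weakly shifted by the class label.
deep = rng.normal(0, 1, (n, 512)) + y[:, None] * 0.5
hog_feats = rng.normal(0, 1, (n, 1764)) + y[:, None] * 0.3

fused = np.hstack([deep, hog_feats])                 # concatenation fusion
reduced = TruncatedSVD(n_components=64, random_state=0).fit_transform(fused)

Xtr, Xte, ytr, yte = train_test_split(reduced, y, test_size=0.3, random_state=0)
acc = LogisticRegression(max_iter=1000).fit(Xtr, ytr).score(Xte, yte)
print(round(acc, 2))
```

Reducing the fused space before classification keeps the complementary signal from both families while discarding redundant dimensions, which is the role SVD plays in the paper's framework.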

26 pages, 35268 KB  
Article
TriEncoderNet: Multi-Stage Fusion of CNN, Transformer, and HOG Features for Forward-Looking Sonar Image Segmentation
by Jie Liu, Yan Dong, Guofang Chen, Yimin Chen, Jian Gao and Fubin Zhang
J. Mar. Sci. Eng. 2025, 13(12), 2295; https://doi.org/10.3390/jmse13122295 - 3 Dec 2025
Viewed by 329
Abstract
Forward-looking sonar (FLS) image segmentation is essential for underwater exploration, but challenges remain, including low contrast, ambient noise, and complex backgrounds, which neither existing traditional nor deep learning-based methods address effectively. This paper presents TriEncoderNet, a novel model that simultaneously extracts local, global, and edge-related features through three parallel encoders. Specifically, the model integrates a convolutional neural network (CNN) for local feature extraction, a transformer for global context modeling, and a histogram of oriented gradients (HOG) encoder for edge and shape detection. The key innovations of TriEncoderNet include the CrossFusionTransformer (CFT) module, which effectively integrates local and global features to capture both fine details and comprehensive context, and the HOG attention gate (HAG) module, which enhances edge detection and preserves semantic consistency across diverse feature types. Additionally, TriEncoderNet introduces the hierarchical efficient transformer (HETransformer) with a lightweight multi-head self-attention mechanism to reduce computational overhead while maintaining global context modeling capability. Experimental results on the marine debris dataset and UATD dataset demonstrate the superior performance of TriEncoderNet. Specifically, it achieves an mIoU of 0.793 and mAP of 0.916 on the marine debris dataset, and an mIoU of 0.582 and mAP of 0.687 on the UATD dataset, outperforming state-of-the-art methods in both segmentation accuracy and robustness in challenging underwater environments. Full article
(This article belongs to the Section Ocean Engineering)

1077 KB  
Proceeding Paper
Enhanced Gait Recognition for Person Identification Using Spatio-Temporal Features and an Attention-Based Deep Learning Model
by Kollaparampil Thomas Thomas and Kaimadathil Pushpangadan Pushpalatha
Eng. Proc. 2025, 118(1), 101; https://doi.org/10.3390/ECSA-12-26532 - 7 Nov 2025
Abstract
Human gait has proven to be one of the standard biometrics for human identification. It is a non-invasive biometric method that uses human walking patterns specific to each human being. In most of the traditional methods, we use handcrafted features of simple convolutional models for gait analysis in human identification. Here, we may face challenges addressing complex temporal dependencies in gait sequences. This study proposes a novel deep learning framework that applies multi-feature input representations. It combines Gait Energy Images (GEIs), Frame Difference Gait Images (FDGIs), and Histogram of Oriented Gradients (HOG) features. This is proposed for enhancing the accuracy of human identification. The proposed work implements a CNN-based feature extractor with an attention mechanism for gait recognition. The model is trained and validated on a labeled dataset, showcasing its ability to learn discriminative gait representations with improved accuracy. The proposed pipeline of activities includes preprocessing and converting gait sequences into frames, organizing them using folder-based numerical extraction, followed by the training of an attention-enhanced convolutional network. The proposed model was found to perform better than existing methods on public datasets and works well even with different camera angles and clothing styles. Full article
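The two image representations named in this abstract have simple standard definitions, sketched below on random stand-in silhouettes (real GEIs/FDGIs are computed from segmented, aligned binary silhouettes of one gait cycle):

```python
import numpy as np

rng = np.random.default_rng(2)

# Stand-in gait cycle: 30 binary silhouette frames of size 64x44.
frames = (rng.random((30, 64, 44)) > 0.7).astype(float)

# Gait Energy Image (GEI): per-pixel mean of the aligned silhouettes,
# summarizing the static body shape over one gait cycle.
gei = frames.mean(axis=0)

# Frame Difference Gait Image (FDGI): mean absolute difference between
# consecutive frames, emphasizing the moving body parts.
fdgi = np.abs(np.diff(frames, axis=0)).mean(axis=0)

print(gei.shape, fdgi.shape)
```

The GEI captures static shape, the FDGI captures motion, and HOG adds edge-orientation structure; the three give the CNN complementary views of the same sequence.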

36 pages, 4464 KB  
Article
Efficient Image-Based Memory Forensics for Fileless Malware Detection Using Texture Descriptors and LIME-Guided Deep Learning
by Qussai M. Yaseen, Esraa Oudat, Monther Aldwairi and Salam Fraihat
Computers 2025, 14(11), 467; https://doi.org/10.3390/computers14110467 - 1 Nov 2025
Viewed by 1318
Abstract
Memory forensics is an essential cybersecurity tool that comprehensively examines volatile memory to detect the malicious activity of fileless malware that can bypass disk analysis. Image-based detection techniques provide a promising solution by visualizing memory data as images to be analyzed by image processing tools and machine learning methods. However, the effectiveness of image-based data for detection and classification requires high computational efforts. This paper investigates the efficacy of texture-based methods in detecting and classifying memory-resident or fileless malware using different image resolutions, identifying the best feature descriptors, classifiers, and resolutions that accurately classify malware into specific families and differentiate them from benign software. Moreover, this paper uses both local and global descriptors, where local descriptors include Oriented FAST and Rotated BRIEF (ORB), Scale-Invariant Feature Transform (SIFT), and Histogram of Oriented Gradients (HOG), and global descriptors include Discrete Wavelet Transform (DWT), GIST, and Gray Level Co-occurrence Matrix (GLCM). The results indicate that as image resolution increases, most feature descriptors yield more discriminative features but require higher computational efforts in terms of time and processing resources. To address this challenge, this paper proposes a novel approach that integrates Local Interpretable Model-agnostic Explanations (LIME) with deep learning models to automatically identify and crop the most important regions of memory images. LIME ROIs were extracted separately from the predictions of the ResNet50 and MobileNet models, the images were resized to 128 × 128, and the sampling process was performed dynamically to speed up LIME computation. The ROIs of the images are cropped to new images of size 100 × 100 in two stages: a coarse stage and a fine stage.
The two LIME-based cropped images generated using ResNet50 and MobileNet are fed to a lightweight neural network to evaluate the effectiveness of the LIME-identified regions. The results demonstrate that LIME-based cropping guided by the MobileNet model’s predictions improves the efficiency of the model while preserving important features, achieving a classification accuracy of 85% on multi-class classification. Full article
(This article belongs to the Special Issue Using New Technologies in Cyber Security Solutions (2nd Edition))
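The "identify the important region, then crop it" idea above can be illustrated with a much simpler occlusion-based importance map (not LIME itself, and not the authors' pipeline): mask each patch, measure the drop in a model score, and crop around the patch whose removal hurts most. The `score` function here is a fixed stand-in for a real classifier.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy 128x128 "memory image" whose informative content sits in one region.
img = rng.normal(0.0, 0.1, (128, 128))
img[64:100, 64:100] += 1.0

def score(x):
    # Stand-in for a trained classifier's confidence; a real pipeline would
    # call ResNet50/MobileNet here instead of this fixed-window mean.
    return x[64:100, 64:100].mean()

base = score(img)
patch = 32
importance = np.zeros((128 // patch, 128 // patch))
for i in range(0, 128, patch):
    for j in range(0, 128, patch):
        occluded = img.copy()
        occluded[i:i + patch, j:j + patch] = 0.0      # mask one patch
        importance[i // patch, j // patch] = base - score(occluded)

# Crop a 100x100 window centered on the most important patch, clamped to bounds.
pi, pj = np.unravel_index(importance.argmax(), importance.shape)
ci = min(max(pi * patch + patch // 2 - 50, 0), 128 - 100)
cj = min(max(pj * patch + patch // 2 - 50, 0), 128 - 100)
crop = img[ci:ci + 100, cj:cj + 100]
print(crop.shape)
```

LIME refines this idea by fitting a local interpretable model over many random superpixel perturbations rather than masking one patch at a time.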

12 pages, 1247 KB  
Article
Artificial Intelligence-Assisted Wrist Radiography Analysis in Orthodontics: Classification of Maturation Stage
by Nursezen Kavasoglu, Omer Faruk Ertugrul, Seda Kotan, Yunus Hazar and Veysel Eratilla
Appl. Sci. 2025, 15(21), 11681; https://doi.org/10.3390/app152111681 - 31 Oct 2025
Viewed by 495
Abstract
This study aims to evaluate the ability of an artificial intelligence (AI) model developed for use in the field of orthodontics to accurately and reliably classify skeletal maturation stages of individuals using hand–wrist radiographs. A total of 809 grayscale hand–wrist radiographs (250 × 250 px; pre-peak n = 400, peak n = 100, post-peak n = 309) were analyzed using four complementary image-based feature extraction methods: Local Binary Pattern (LBP), Histogram of Oriented Gradients (HOG), Zernike Moments (ZM), and Intensity Histogram (IH). These methods generated 2355 features per image, of which 2099 were retained after variance thresholding. The most informative 1250 features were selected using the ANOVA F-test and classified with a stacking-based machine learning (ML) architecture composed of Light Gradient Boosting Machine (LightGBM) and Logistic Regression (LR) as base learners, and Random Forest (RF) as the meta-learner. Across all evaluation folds, the average performance of the model was Accuracy = 83.42%, Precision = 84.48%, Recall = 83.42%, and F1 = 83.50%. The proposed model achieved 87.5% accuracy, 87.8% precision, 87.5% recall, and an F1-score of 87.6% in 10-fold cross-validation, with a macro-average area under the ROC curve (AUC) of 0.96. The pre-peak stage, corresponding to the period of maximum growth velocity, was identified with 92.5% accuracy. These findings indicate that integrating handcrafted radiographic features with ensemble learning can enhance diagnostic precision, reduce observer variability, and accelerate evaluation. The model provides an interpretable and clinically applicable AI-based decision-support tool for skeletal maturity assessment in orthodontic practice. Full article
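The selection-plus-stacking design described above can be sketched with scikit-learn. This is an illustrative sketch on synthetic data: the features stand in for the handcrafted LBP/HOG/ZM/IH descriptors, the three classes play the roles of the pre-peak/peak/post-peak stages, and GradientBoosting stands in for the paper's LightGBM base learner.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier,
                              RandomForestClassifier, StackingClassifier)
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the handcrafted radiographic features.
X, y = make_classification(n_samples=300, n_features=400, n_informative=40,
                           n_classes=3, random_state=0)

# ANOVA F-test keeps the most class-discriminative features.
X_sel = SelectKBest(f_classif, k=100).fit_transform(X, y)

# Stacking as in the abstract, with GradientBoosting standing in for LightGBM
# and Logistic Regression as base learners, Random Forest as meta-learner.
stack = StackingClassifier(
    estimators=[("gb", GradientBoostingClassifier(random_state=0)),
                ("lr", LogisticRegression(max_iter=1000))],
    final_estimator=RandomForestClassifier(random_state=0))

scores = cross_val_score(stack, X_sel, y, cv=3)
print(round(scores.mean(), 2))
```

The meta-learner sees the base learners' out-of-fold predictions, so stacking can correct systematic errors that either base model makes alone.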

29 pages, 1239 KB  
Article
Uncovering Causal Factors Influencing Hog Prices: A Deep Granger Causality Inference Model for Multivariate Time Series Dynamics
by Xin Lai, Mingyu Xu, Bohan Ouyang, Wenkai Shi, Yumin Lai and Shiming Deng
Appl. Sci. 2025, 15(20), 11081; https://doi.org/10.3390/app152011081 - 16 Oct 2025
Viewed by 589
Abstract
The swine industry is vital to economic stability and household welfare in China and worldwide but remains highly vulnerable to price volatility driven by multiple factors. Capturing the underlying mechanisms of hog price formation is particularly challenging, as conventional models often fail to represent its nonlinear structures and complex multivariate causal dependencies. This study proposes a Deep Granger Causality Inference (DGCI) model that integrates deep learning with causal inference to identify the key driving factors of hog price dynamics. The DGCI model contains a Feature Reconstruction Module (FRM) and a Granger Causality Module (GCM). The FRM integrates a Variational Autoencoder (VAE) with a Transformer to capture latent temporal representations of multivariate variables. Meanwhile, the GCM quantifies nonlinear Granger causality strength by systematically excluding features to measure their causal impact on hog price. Furthermore, this study proposes the Causal Feature Importance (CFI) metric, which jointly evaluates reconstruction fidelity and causal strength to identify key determinants. To evaluate the model performance, this study utilizes a real-world hog dataset from China. The results demonstrate considerable gains, with DGCI decreasing MSE by 17.59% to 39.22% and MSPE by 32.35% to 54.90% relative to baseline models. The DGCI model highlights pork price, piglet cost, and slaughter volume as the primary determinants of hog price, with CFI values of 1.5216, 1.4451, and 1.4266, respectively. By advancing understanding of the causal drivers of price volatility, this study contributes to informed decision-making, enhanced food security, and the sustainable development of the swine industry. Moreover, as a generalizable methodology, the proposed framework can be broadly applied to analyze the influencing factors of other agricultural and livestock products. Full article
(This article belongs to the Special Issue Applied Artificial Intelligence and Data Science)
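The Granger-causality test underlying this framework compares a restricted autoregression of the target on its own lags against an unrestricted one that adds lags of a candidate driver; a large F-statistic means the driver helps predict the target. A minimal linear version (the paper's DGCI is a nonlinear deep-learning generalization) on synthetic series:

```python
import numpy as np

rng = np.random.default_rng(4)

# Two synthetic series: x (think pork price) drives y (hog price) at lag 1.
n, p = 300, 2                         # series length, lag order
x = rng.normal(0, 1, n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + rng.normal(0, 0.1)

def rss(target, regressors):
    beta, *_ = np.linalg.lstsq(regressors, target, rcond=None)
    resid = target - regressors @ beta
    return float(resid @ resid)

# Restricted model: y explained by its own lags only.
# Unrestricted model: add p lags of x; a large F means x Granger-causes y.
Y = y[p:]
own = np.column_stack([y[p - k:n - k] for k in range(1, p + 1)])
cross = np.column_stack([x[p - k:n - k] for k in range(1, p + 1)])
ones = np.ones((len(Y), 1))
rss_r = rss(Y, np.hstack([ones, own]))
rss_u = rss(Y, np.hstack([ones, own, cross]))

df_u = len(Y) - (1 + 2 * p)           # residual degrees of freedom
F = ((rss_r - rss_u) / p) / (rss_u / df_u)
print(F > 10)
```

DGCI replaces the linear regressions with learned representations and measures the causal strength by excluding each feature, but the underlying comparison (does the candidate's history improve prediction?) is the same.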

25 pages, 12760 KB  
Article
Intelligent Face Recognition: Comprehensive Feature Extraction Methods for Holistic Face Analysis and Modalities
by Thoalfeqar G. Jarullah, Ahmad Saeed Mohammad, Musab T. S. Al-Kaltakchi and Jabir Alshehabi Al-Ani
Signals 2025, 6(3), 49; https://doi.org/10.3390/signals6030049 - 19 Sep 2025
Viewed by 2214
Abstract
Face recognition technology utilizes unique facial features to analyze and compare individuals for identification and verification purposes. This technology is crucial for several reasons, such as improving security and authentication, effectively verifying identities, providing personalized user experiences, and automating various operations, including attendance monitoring, access management, and law enforcement activities. In this paper, comprehensive evaluations are conducted using different face detection and modality segmentation methods, feature extraction methods, and classifiers to improve system performance. As for face detection, four methods are proposed: OpenCV’s Haar Cascade classifier, Dlib’s HOG + SVM frontal face detector, Dlib’s CNN face detector, and Mediapipe’s face detector. Additionally, two types of feature extraction techniques are proposed: hand-crafted features (traditional methods: global and local features) and deep learning features. Three global features were extracted: Scale-Invariant Feature Transform (SIFT), Speeded-Up Robust Features (SURF), and Global Image Structure (GIST). Likewise, the following local feature methods are utilized: Local Binary Pattern (LBP), Weber local descriptor (WLD), and Histogram of Oriented Gradients (HOG). On the other hand, the deep learning-based features fall into two categories: convolutional neural networks (CNNs), including VGG16, VGG19, and VGG-Face, and Siamese neural networks (SNNs), which generate face embeddings. For classification, three methods are employed: Support Vector Machine (SVM), a one-class SVM variant, and Multilayer Perceptron (MLP). The system is evaluated on three datasets: in-house, Labelled Faces in the Wild (LFW), and the Pins dataset (sourced from Pinterest), providing comprehensive benchmark comparisons for facial recognition research.
Among the ten proposed feature extraction methods applied to the in-house database, the best facial recognition accuracy of 99.8% was achieved by the VGG16 model combined with the SVM classifier. Full article

19 pages, 8262 KB  
Article
Oil Spill Identification with Marine Radar Using Feature Augmentation and Improved Firefly Optimization Algorithm
by Jin Xu, Boxi Yao, Haihui Dong, Zekun Guo, Bo Xu, Yuanyuan Huang, Bo Li, Sihan Qian and Bingxin Liu
Remote Sens. 2025, 17(18), 3148; https://doi.org/10.3390/rs17183148 - 10 Sep 2025
Viewed by 785
Abstract
Oil spill accidents pose a grave threat to marine ecosystems, the economy, and public health. Consequently, expeditious and efficacious oil spill detection technology is imperative for pollution mitigation and health preservation in the marine environment. This study proposed a marine radar oil spill detection method based on Local Binary Patterns (LBP), Histogram of Oriented Gradient (HOG), and an improved Firefly Optimization Algorithm (IFA). In the image pre-processing stage, the oil film features were significantly enhanced through three steps. The LBP features were extracted from the preprocessed image. Then, mean filtering was used to smooth the LBP features. Subsequently, the HOG statistical features were extracted from the filtered LBP feature map. After the feature enhancement, the oil spill regions were accurately extracted using the K-Means clustering algorithm. Next, an IFA model was used to classify oil films. Compared with the traditional Firefly Optimization Algorithm (FA), the IFA method is better suited to oil film segmentation tasks in marine radar data. The proposed method achieves accurate segmentation and provides a new technical path for marine oil spill monitoring. Full article
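The LBP → mean filter → HOG → K-Means chain described in this abstract can be sketched directly with scikit-image and scikit-learn. The toy image, patch placement, and parameter choices below are illustrative assumptions, not the paper's settings:

```python
import numpy as np
from scipy.ndimage import uniform_filter
from skimage.feature import hog, local_binary_pattern
from sklearn.cluster import KMeans

rng = np.random.default_rng(5)

# Toy radar-like intensity image: a brighter, rougher "oil film" patch.
img = rng.normal(0.3, 0.05, (128, 128))
img[40:90, 40:90] += rng.normal(0.4, 0.1, (50, 50))

lbp = local_binary_pattern(img, P=8, R=1, method="uniform")  # texture codes
lbp_smooth = uniform_filter(lbp, size=5)                     # mean filtering

# HOG statistics over the filtered LBP map, one 9-bin histogram per 16x16 cell.
hog_map = hog(lbp_smooth, orientations=9, pixels_per_cell=(16, 16),
              cells_per_block=(1, 1), feature_vector=False)
cells = hog_map.reshape(-1, 9)                               # 8x8 grid of cells

# K-Means partitions cells into candidate oil-film vs. background by texture.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(cells)
print(labels.shape)
```

Smoothing the LBP map before taking HOG statistics suppresses speckle-like noise in the texture codes, which is why the pipeline filters between the two descriptors.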

27 pages, 4065 KB  
Article
Synthesis and Antimicrobial Evaluation of Chroman-4-One and Homoisoflavonoid Derivatives
by Carlos d. S. M. Bezerra Filho, José L. F. M. Galvão, Edeltrudes O. Lima, Yunierkis Perez-Castillo, Yendrek Velásquez-López and Damião P. de Sousa
Molecules 2025, 30(17), 3575; https://doi.org/10.3390/molecules30173575 - 31 Aug 2025
Viewed by 2310
Abstract
The continuous increase in microbial resistance to therapeutic agents has become one of the greatest challenges to global health. In this context, the present study investigated the bioactivity of 25 chroman-4-one and homoisoflavonoid derivatives—17 of which are novel—against pathogenic microorganisms, including Staphylococcus epidermidis, Pseudomonas aeruginosa, Salmonella enteritidis, Candida albicans, C. tropicalis, Nakaseomyces glabratus (formerly C. glabrata), Aspergillus flavus, and Penicillium citrinum. Antimicrobial assay was performed using the microdilution technique in 96-well microplates to determine the minimum inhibitory concentration (MIC). Thirteen compounds exhibited antimicrobial activity, with compounds 1, 2, and 21 demonstrating greater potency than the positive control, especially against Candida species. Molecular modeling suggested distinct mechanisms of action in Candida albicans: 1 potentially inhibits cysteine synthase, while 2 and 21 possibly target HOG1 kinase and FBA1, key proteins in fungal virulence and survival. Our findings indicated that the addition of alkyl or aryl carbon chains at the hydroxyl group at position 7 reduces antimicrobial activity, whereas the presence of methoxy substituents at the meta position of ring B in homoisoflavonoids enhances bioactivity. These findings highlight key structural features of these compound classes, which may aid in the development of new bioactive agents against pathogenic microorganisms. Full article

30 pages, 4741 KB  
Article
TriViT-Lite: A Compact Vision Transformer–MobileNet Model with Texture-Aware Attention for Real-Time Facial Emotion Recognition in Healthcare
by Waqar Riaz, Jiancheng (Charles) Ji and Asif Ullah
Electronics 2025, 14(16), 3256; https://doi.org/10.3390/electronics14163256 - 16 Aug 2025
Cited by 2 | Viewed by 948
Abstract
Facial emotion recognition has become increasingly important in healthcare, where understanding subtle cues like pain, discomfort, or unconsciousness can support more timely and responsive care. Yet, recognizing facial expressions in real-world settings remains challenging due to varying lighting, facial occlusions, and hardware limitations in clinical environments. To address this, we propose TriViT-Lite, a lightweight yet powerful model that blends three complementary components: MobileNet, for capturing fine-grained local features efficiently; Vision Transformers (ViT), for modeling global facial patterns; and handcrafted texture descriptors, such as Local Binary Patterns (LBP) and Histograms of Oriented Gradients (HOG), for added robustness. These multi-scale features are brought together through a texture-aware cross-attention fusion mechanism that helps the model focus dynamically on the most relevant facial regions. TriViT-Lite is evaluated on both benchmark datasets (FER2013, AffectNet) and a custom healthcare-oriented dataset covering seven critical emotional states, including pain and unconsciousness. It achieves a competitive accuracy of 91.8% on FER2013 and 87.5% on the custom dataset while maintaining real-time performance (~15 FPS) on resource-constrained edge devices. Our results show that TriViT-Lite offers a practical and accurate solution for real-time emotion recognition, particularly in healthcare settings. It strikes a balance between performance, interpretability, and efficiency, making it a strong candidate for machine-learning-driven pattern recognition in patient-monitoring applications. Full article

16 pages, 2943 KB  
Article
Long Short-Term Memory-Based Fall Detection by Frequency-Modulated Continuous Wave Millimeter-Wave Radar Sensor for Seniors Living Alone
by Yun Seop Yu, Seongjo Wie, Hojin Lee, Jeongwoo Lee and Nam Ho Kim
Appl. Sci. 2025, 15(15), 8381; https://doi.org/10.3390/app15158381 - 28 Jul 2025
Cited by 3 | Viewed by 3887
Abstract
In this study, four types of fall detection systems for seniors living alone using x-y scatter and Doppler range images measured from frequency-modulated continuous wave (FMCW) millimeter-wave (mmWave) sensors were introduced. Despite advancements in fall detection, existing long short-term memory (LSTM)-based approaches often struggle with effectively distinguishing falls from similar activities of daily living (ADLs) due to their uniform treatment of all time steps, potentially overlooking critical motion cues. To address this limitation, an attention mechanism has been integrated. Data were collected from seven participants, resulting in a dataset of 669 samples, including 285 falls and 384 ADLs with walking, lying, inactivity, and sitting. Four LSTM-based architectures for fall detection were proposed and evaluated: Raw-LSTM, Raw-LSTM-Attention, HOG-LSTM, and HOG-LSTM-Attention. The histogram of oriented gradient (HOG) method was used for feature extraction, while LSTM networks captured temporal dependencies. The attention mechanism further enhanced model performance by focusing on relevant input features. The Raw-LSTM model processed raw mmWave radar images through LSTM layers and dense layers for classification. The Raw-LSTM-Attention model extended Raw-LSTM with an added self-attention mechanism within the traditional attention framework. The HOG-LSTM model added a preprocessing step to the Raw-LSTM model in which HOG features were extracted and classified using an SVM. The HOG-LSTM-Attention model built upon the HOG-LSTM model by incorporating a self-attention mechanism to enhance the model’s ability to accurately classify activities. Evaluation metrics such as Sensitivity, Precision, Accuracy, and F1-Score were used to compare the four architectural models. The results showed that the HOG-LSTM-Attention model achieved the highest performance, with an Accuracy of 95.3% and an F1-Score of 95.5%.
The optimal self-attention configuration was a 2:64 ratio of attention heads to channels for keys and queries. Full article

21 pages, 3448 KB  
Article
A Welding Defect Detection Model Based on Hybrid-Enhanced Multi-Granularity Spatiotemporal Representation Learning
by Chenbo Shi, Shaojia Yan, Lei Wang, Changsheng Zhu, Yue Yu, Xiangteng Zang, Aiping Liu, Chun Zhang and Xiaobing Feng
Sensors 2025, 25(15), 4656; https://doi.org/10.3390/s25154656 - 27 Jul 2025
Viewed by 1738
Abstract
Real-time quality monitoring using molten pool images is a critical focus in researching high-quality, intelligent automated welding. To address interference problems in molten pool images under complex welding scenarios (e.g., reflected laser spots from spatter misclassified as porosity defects) and the limited interpretability of deep learning models, this paper proposes a multi-granularity spatiotemporal representation learning algorithm based on the hybrid enhancement of handcrafted and deep learning features. A MobileNetV2 backbone network integrated with a Temporal Shift Module (TSM) is designed to progressively capture the short-term dynamic features of the molten pool and integrate temporal information across both low-level and high-level features. A multi-granularity attention-based feature aggregation module is developed to select key interference-free frames using cross-frame attention, generate multi-granularity features via grouped pooling, and apply the Convolutional Block Attention Module (CBAM) at each granularity level. Finally, these multi-granularity spatiotemporal features are adaptively fused. Meanwhile, an independent branch utilizes the Histogram of Oriented Gradient (HOG) and Scale-Invariant Feature Transform (SIFT) features to extract long-term spatial structural information from historical edge images, enhancing the model’s interpretability. The proposed method achieves an accuracy of 99.187% on a self-constructed dataset. Additionally, it attains a real-time inference speed of 20.983 ms per sample on a hardware platform equipped with an Intel i9-12900H CPU and an RTX 3060 GPU, thus effectively balancing accuracy, speed, and interpretability. Full article
(This article belongs to the Topic Applied Computing and Machine Intelligence (ACMI))
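The HOG descriptor named in this abstract (and in several other results on this page) reduces to a simple recipe: compute per-pixel gradients, then accumulate magnitude-weighted orientation histograms over small cells. A minimal NumPy sketch, with arbitrary cell size and bin count rather than any paper's settings, and with block normalisation (part of the full HOG pipeline) omitted for brevity:

```python
import numpy as np

def hog_descriptor(image, cell=8, bins=9):
    """Minimal Histogram of Oriented Gradients (HOG) sketch.

    Gradients are estimated with finite differences; the image is split
    into cell x cell regions, and each contributes a `bins`-bin histogram
    of unsigned gradient orientations weighted by gradient magnitude.
    Block normalisation, used in the full HOG pipeline, is omitted here.
    """
    gy, gx = np.gradient(image.astype(float))   # row and column gradients
    mag = np.hypot(gx, gy)                      # gradient magnitude
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0  # unsigned angle in [0, 180)
    h, w = image.shape
    feats = []
    for y in range(0, h - cell + 1, cell):
        for x in range(0, w - cell + 1, cell):
            a = ang[y:y + cell, x:x + cell].ravel()
            m = mag[y:y + cell, x:x + cell].ravel()
            hist, _ = np.histogram(a, bins=bins, range=(0, 180), weights=m)
            feats.append(hist)
    return np.concatenate(feats)

# Toy 16x16 image with a single vertical edge.
img = np.zeros((16, 16))
img[:, 8:] = 1.0
vec = hog_descriptor(img)
print(vec.shape)  # 2x2 cells * 9 bins -> (36,)
```

In practice a library implementation such as `skimage.feature.hog` would be used; this sketch only makes the orientation-histogram idea concrete.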

16 pages, 3953 KB  
Article
Skin Lesion Classification Using Hybrid Feature Extraction Based on Classical and Deep Learning Methods
by Maryem Zahid, Mohammed Rziza and Rachid Alaoui
BioMedInformatics 2025, 5(3), 41; https://doi.org/10.3390/biomedinformatics5030041 - 16 Jul 2025
Cited by 1 | Viewed by 2428
Abstract
This paper proposes a hybrid method for skin lesion classification combining deep learning features with conventional descriptors such as HOG, Gabor, SIFT, and LBP. Features of interest were extracted within the tumor area using the suggested fusion methods. We tested and compared features obtained from different deep learning models coupled with HOG-based features. Dimensionality reduction and performance improvement were achieved by Principal Component Analysis, after which an SVM was used for classification. The compared methods were tested on the reference database skin cancer-malignant-vs-benign. The results show a significant improvement in accuracy due to the complementarity between the conventional and deep learning-based methods. Specifically, the addition of HOG descriptors led to an accuracy increase of 5% for EfficientNetB0, 7% for ResNet50, 5% for ResNet101, 1% for NASNetMobile, 1% for DenseNet201, and 1% for MobileNetV2. These findings confirm that feature fusion significantly enhances performance compared to the individual application of each method.
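The fusion pipeline this abstract describes (deep features concatenated with HOG features, PCA reduction, SVM classification) can be sketched with scikit-learn. The feature matrices below are synthetic stand-ins with an injected class signal, not the paper's CNN or HOG outputs, and the dimensions are arbitrary assumptions:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 200
deep_feats = rng.normal(size=(n, 128))   # stand-in for a CNN embedding
hog_feats = rng.normal(size=(n, 36))     # stand-in for a HOG descriptor
y = rng.integers(0, 2, size=n)           # benign / malignant labels
# Inject a weak class-dependent shift so the pipeline has signal to learn.
deep_feats[y == 1, :4] += 2.0

X = np.hstack([deep_feats, hog_feats])   # feature-level fusion
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# PCA for dimensionality reduction, then an SVM classifier.
clf = make_pipeline(PCA(n_components=20), SVC(kernel="rbf"))
clf.fit(X_tr, y_tr)
acc = clf.score(X_te, y_te)
print(f"test accuracy: {acc:.2f}")
```

The same skeleton applies to any of the backbone/HOG pairings compared in the paper; only the feature extractors change.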

13 pages, 2266 KB  
Article
The Detection and Classification of Grape Leaf Diseases with an Improved Hybrid Model Based on Feature Engineering and AI
by Fatih Atesoglu and Harun Bingol
AgriEngineering 2025, 7(7), 228; https://doi.org/10.3390/agriengineering7070228 - 9 Jul 2025
Cited by 1 | Viewed by 2319
Abstract
Many products are obtained from grapes. The early detection of diseases in this economically important fruit is essential, as the spread of disease significantly increases financial losses. In recent years, artificial intelligence techniques have achieved very successful results in image classification. Therefore, within the scope of this study, the early detection and classification of grape diseases was carried out using recent artificial intelligence and feature reduction techniques. The methods used in this article include well-known convolutional neural network (CNN) architectures, the texture-based Local Binary Pattern (LBP) and Histogram of Oriented Gradients (HOG) methods, Neighborhood Component Analysis (NCA) for feature reduction, and machine learning (ML) classifiers. The proposed hybrid model was compared with two texture-based and four CNN models. The features from the most successful CNN model and the texture-based architectures were combined. The NCA method was used to select the best features from the resulting feature map, and the model was classified using well-known ML classifiers. Our proposed model achieved an accuracy of 99.1%, showing that it can be used in the detection of grape diseases.
(This article belongs to the Special Issue Implementation of Artificial Intelligence in Agriculture)
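The NCA-then-classify step this abstract relies on can be illustrated with scikit-learn's `NeighborhoodComponentsAnalysis`, which learns a discriminative linear projection ahead of a nearest-neighbour classifier. The features below are synthetic stand-ins for the fused CNN + LBP + HOG map, with an injected class signal; class count and dimensions are illustrative assumptions:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier, NeighborhoodComponentsAnalysis
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(1)
n, d = 300, 40
X = rng.normal(size=(n, d))              # stand-in fused feature map
y = rng.integers(0, 4, size=n)           # four leaf-disease classes
X[np.arange(n), y] += 3.0                # class-dependent signal on one feature each

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)

# NCA reduces 40 features to 8 discriminative components; KNN classifies.
model = make_pipeline(
    NeighborhoodComponentsAnalysis(n_components=8, random_state=1),
    KNeighborsClassifier(n_neighbors=5),
)
model.fit(X_tr, y_tr)
score = model.score(X_te, y_te)
print(f"accuracy: {score:.2f}")
```

Note that scikit-learn's NCA learns a projection rather than selecting individual features, so this is an NCA-style reduction sketch, not a reproduction of the paper's exact selection procedure.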
