Search Results (10)

Search Parameters:
Keywords = lightweight skeleton feature extraction

23 pages, 3125 KiB  
Article
Classification of Complex Power Quality Disturbances Based on Lissajous Trajectory and Lightweight DenseNet
by Xi Zhang, Jianyong Zheng, Fei Mei and Huiyu Miao
Appl. Sci. 2025, 15(14), 8021; https://doi.org/10.3390/app15148021 - 18 Jul 2025
Viewed by 201
Abstract
With the increase in the penetration rate of distributed sources and loads, the sensor monitoring data is increasing dramatically. Power grid maintenance services require a rapid response in power quality data analysis. To achieve a rapid response and highly accurate classification of power quality disturbances (PQDs), this paper proposes an efficient classification algorithm for PQDs based on Lissajous trajectory (LT) and a lightweight DenseNet, which utilizes the concept of Lissajous curves to construct an ideal reference signal and combines it with the original PQD signal to synthesize a feature trajectory with a distinctive shape. Meanwhile, to enhance the ability and efficiency of capturing trajectory features, a lightweight L-DenseNet skeleton model is designed, and its feature extraction capability is further improved by integrating an attention mechanism with L-DenseNet. Finally, the LT image is input into the fusion model for training, and PQD classification is achieved using the optimally trained model. The experimental results demonstrate that, compared with current mainstream PQD classification methods, the proposed algorithm not only achieves superior disturbance classification accuracy and noise robustness but also significantly improves response speed in PQD classification tasks through its concise visualization conversion process and lightweight model design. Full article
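The trajectory construction described in this abstract can be illustrated with a minimal sketch. It assumes a unit-amplitude cosine reference at a 50 Hz fundamental and a 3200 Hz sampling rate; the abstract does not say how the reference signal is built, so the function names and all parameters below are hypothetical.

```python
import math

def lissajous_trajectory(signal, fs, f0=50.0):
    """Pair each sample with an ideal cosine reference at f0 (assumed form).

    Returns (x, y) points: x is the reference, y is the measured signal.
    """
    return [(math.cos(2 * math.pi * f0 * n / fs), y)
            for n, y in enumerate(signal)]

def sine_with_sag(fs, f0=50.0, cycles=4, depth=0.5):
    """Unit sine whose amplitude drops by `depth` during cycles 2 and 3."""
    per_cycle = int(round(fs / f0))
    samples = []
    for n in range(per_cycle * cycles):
        in_sag = per_cycle <= n < 3 * per_cycle
        amplitude = 1.0 - depth if in_sag else 1.0
        samples.append(amplitude * math.sin(2 * math.pi * f0 * n / fs))
    return samples

fs = 3200.0  # hypothetical sampling rate: 64 samples per 50 Hz cycle
clean = [math.sin(2 * math.pi * 50.0 * n / fs) for n in range(256)]

traj_clean = lissajous_trajectory(clean, fs)
traj_sag = lissajous_trajectory(sine_with_sag(fs), fs)

# A clean sine against a quadrature reference traces the unit circle;
# a sag squeezes part of the trajectory into an ellipse.
radii_clean = [math.hypot(x, y) for x, y in traj_clean]
radii_sag = [math.hypot(x, y) for x, y in traj_sag]
```

Under these assumptions the clean signal stays on the unit circle while the sag pulls the trajectory inward, which is the kind of distinctive shape an image classifier could pick up.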

20 pages, 5700 KiB  
Article
Multimodal Personality Recognition Using Self-Attention-Based Fusion of Audio, Visual, and Text Features
by Hyeonuk Bhin and Jongsuk Choi
Electronics 2025, 14(14), 2837; https://doi.org/10.3390/electronics14142837 - 15 Jul 2025
Viewed by 380
Abstract
Personality is a fundamental psychological trait that exerts a long-term influence on human behavior patterns and social interactions. Automatic personality recognition (APR) has exhibited increasing importance across various domains, including Human–Robot Interaction (HRI), personalized services, and psychological assessments. In this study, we propose a multimodal personality recognition model that classifies the Big Five personality traits by extracting features from three heterogeneous sources: audio processed using Wav2Vec2, video represented as Skeleton Landmark time series, and text encoded through Bidirectional Encoder Representations from Transformers (BERT) and Doc2Vec embeddings. Each modality is handled through an independent Self-Attention block that highlights salient temporal information, and these representations are then summarized and integrated using a late fusion approach to effectively reflect both the inter-modal complementarity and cross-modal interactions. Compared to traditional recurrent neural network (RNN)-based multimodal models and unimodal classifiers, the proposed model achieves an improvement of up to 12 percent in the F1-score. It also maintains a high prediction accuracy and robustness under limited input conditions. Furthermore, a visualization based on t-distributed Stochastic Neighbor Embedding (t-SNE) demonstrates clear distributional separation across the personality classes, enhancing the interpretability of the model and providing insights into the structural characteristics of its latent representations. To support real-time deployment, a lightweight thread-based processing architecture is implemented, ensuring computational efficiency. By leveraging deep learning-based feature extraction and the Self-Attention mechanism, we present a novel personality recognition framework that balances performance with interpretability. 
The proposed approach establishes a strong foundation for practical applications in HRI, counseling, education, and other interactive systems that require personalized adaptation. Full article
(This article belongs to the Special Issue Explainable Machine Learning and Data Mining)

15 pages, 1463 KiB  
Article
Spatial–Temporal Heatmap Masked Autoencoder for Skeleton-Based Action Recognition
by Cunling Bian, Yang Yang, Tao Wang and Weigang Lu
Sensors 2025, 25(10), 3146; https://doi.org/10.3390/s25103146 - 16 May 2025
Viewed by 617
Abstract
Skeleton representation learning offers substantial advantages for action recognition by encoding intricate motion details and spatial–temporal dependencies among joints. However, fully supervised approaches necessitate large amounts of annotated data, which are often labor-intensive and costly to acquire. In this work, we propose the Spatial–Temporal Heatmap Masked Autoencoder (STH-MAE), a novel self-supervised framework tailored for skeleton-based action recognition. Unlike coordinate-based methods, STH-MAE adopts heatmap volumes as its primary representation, mitigating noise inherent in pose estimation while capitalizing on advances in Vision Transformers. The framework constructs a spatial–temporal heatmap (STH) by aggregating 2D joint heatmaps across both spatial and temporal axes. This STH is partitioned into non-overlapping patches to facilitate local feature learning, with a masking strategy applied to randomly conceal portions of the input. During pre-training, a Vision Transformer-based autoencoder equipped with a lightweight prediction head reconstructs the masked regions, fostering the extraction of robust and transferable skeletal representations. Comprehensive experiments on the NTU RGB+D 60 and NTU RGB+D 120 benchmarks demonstrate the superiority of STH-MAE, achieving state-of-the-art performance under multiple evaluation protocols. Full article
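The patch partitioning and random masking steps mentioned in this abstract can be sketched as follows. This is a simplified 2D illustration, not the STH-MAE implementation: the heatmap size, patch size, and 75% mask ratio are assumptions, and the function names are hypothetical.

```python
import random

def patchify(grid, patch):
    """Split a 2D grid (list of rows) into non-overlapping patch x patch tiles."""
    height, width = len(grid), len(grid[0])
    if height % patch or width % patch:
        raise ValueError("grid size must be divisible by the patch size")
    tiles = []
    for r in range(0, height, patch):
        for c in range(0, width, patch):
            tiles.append([row[c:c + patch] for row in grid[r:r + patch]])
    return tiles

def random_mask(num_patches, mask_ratio, rng):
    """Pick which patch indices are hidden from the encoder."""
    n_mask = int(num_patches * mask_ratio)
    masked = set(rng.sample(range(num_patches), n_mask))
    visible = [i for i in range(num_patches) if i not in masked]
    return visible, sorted(masked)

# Toy 8x8 "heatmap" standing in for one slice of a spatial-temporal heatmap.
heatmap = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
patches = patchify(heatmap, 2)  # 16 tiles of 2x2

# Assumed mask ratio of 0.75; a seeded generator keeps the split reproducible.
visible_idx, masked_idx = random_mask(len(patches), 0.75, random.Random(0))
```

In a masked autoencoder, only the visible patches would be fed to the encoder, and the prediction head would be trained to reconstruct the masked ones.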
(This article belongs to the Section Intelligent Sensors)

30 pages, 19525 KiB  
Article
Disease Monitoring and Characterization of Feeder Road Network Based on Improved YOLOv11
by Ying Fan, Kun Zhi, Haichao An, Runyin Gu, Xiaobing Ding and Jianhua Tang
Electronics 2025, 14(9), 1818; https://doi.org/10.3390/electronics14091818 - 29 Apr 2025
Viewed by 653
Abstract
In response to the challenges of low accuracy and high false-detection and missed-detection rates in disease detection on feeder roads, this paper proposes Rural-YOLO (SAConv-C2f+C2PSA_CAA+MCSAttention+WIOU), an enhanced target detection framework based on the YOLOv11 architecture, for identifying common diseases in complex feeder road environments. The proposed methodology introduces four key innovations: (1) Switchable Atrous Convolution (SAConv) is introduced into the backbone network to enhance multiscale disease feature extraction under occlusion conditions; (2) Multi-Channel and Spatial Attention (MCSAttention) is constructed in the feature fusion process, where adaptive weight redistribution across multiscale diseases improves the model’s sensitivity to subtle disease features; (3) Cross Stage Partial with Parallel Spatial Attention and Channel Adaptive Aggregation (C2PSA_CAA) is constructed at the end of the backbone network to enhance the model’s ability to discriminate between different disease types; (4) Weighted Intersection over Union loss (WIoU_loss) is introduced to mitigate category imbalance, which helps optimize bounding box regression and improves the detection of relevant diseases. Based on experimental validation, Rural-YOLO demonstrated superior performance with minimal computational overhead. Only 0.7 M additional parameters are required, and an 8.4% improvement in recall and a 7.8% increase in mAP50 were achieved compared to the initial models. The optimized architecture also reduced the model size by 21%. The test results showed that the proposed model achieved 3.28 M parameters with a computational complexity of 5.0 GFLOPs, meeting the requirements for lightweight deployment scenarios. 
Cross-validation on multi-scenario public datasets was carried out to assess the model’s robustness across diverse road conditions. In the quantitative experiments, the center skeleton method and the maximum inscribed circle method were used to calculate crack width, and the pixel occupancy ratio method was used to assess the degree of area damage for potholes and other diseases. The measurements were converted to actual physical dimensions using a calibrated scale of 0.081:1. Full article
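The quantitative step at the end of this abstract can be illustrated with a small sketch. It reads the stated 0.081:1 calibration as 0.081 physical units per pixel, which is an assumption because the abstract gives no units, and it treats the pixel occupancy ratio as damaged pixels divided by total pixels. The function names and the toy mask are hypothetical.

```python
SCALE_PER_PIXEL = 0.081  # from the abstract; unit and direction are assumed

def pixel_occupancy_ratio(mask):
    """Fraction of pixels flagged as damaged in a binary mask (list of rows)."""
    total = sum(len(row) for row in mask)
    damaged = sum(1 for row in mask for value in row if value)
    return damaged / total

def to_physical(pixels, scale=SCALE_PER_PIXEL):
    """Convert a length measured in pixels to physical units (assumed linear)."""
    return pixels * scale

# Toy 4x4 segmentation mask: 1 marks a damaged pixel.
mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]

ratio = pixel_occupancy_ratio(mask)  # 4 damaged pixels out of 16
crack_width = to_physical(100)       # a 100-pixel width in physical units
```

A real pipeline would take the mask from the detector's segmentation output and the pixel width from a skeleton-based or inscribed-circle measurement rather than from hand-written values.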

24 pages, 3282 KiB  
Article
Video Abnormal Behavior Recognition and Trajectory Prediction Based on Lightweight Skeleton Feature Extraction
by Ling Wang, Cong Ding, Yifan Zhang, Tie Hua Zhou, Wei Ding, Keun Ho Ryu and Kwang Woo Nam
Sensors 2024, 24(12), 3711; https://doi.org/10.3390/s24123711 - 7 Jun 2024
Cited by 2 | Viewed by 1383
Abstract
Video action recognition based on skeleton nodes is a prominent topic in the computer vision field. In real application scenarios, the large number of skeleton nodes and occlusion between individuals seriously affect recognition speed and accuracy. We therefore propose a lightweight multi-stream feature cross-fusion (L-MSFCF) model to recognize abnormal behaviors such as fighting, vicious kicking, and climbing over a wall. The model improves recognition speed through lightweight skeleton node calculation and improves recognition accuracy through occluded skeleton node prediction, which addresses the behavior occlusion problem. The experiments show that our proposed All-MSFCF model has an average video action recognition accuracy of 92.7% across eight kinds of abnormal behavior. Although our proposed lightweight L-MSFCF model has an 87.3% average accuracy rate, its average recognition speed is 62.7% higher than that of the full-skeleton recognition model, making it more suitable for real-time tracing problems. Moreover, our proposed Trajectory Prediction Tracking (TPT) model can predict moving positions in real time based on dynamically selected core skeleton node calculation, especially for short-term prediction within 15 and 30 frames, which shows lower average loss errors. Full article
(This article belongs to the Section Sensing and Imaging)

15 pages, 5133 KiB  
Article
Human Action Recognition Based on Skeleton Information and Multi-Feature Fusion
by Li Wang, Bo Su, Qunpo Liu, Ruxin Gao, Jianjun Zhang and Guodong Wang
Electronics 2023, 12(17), 3702; https://doi.org/10.3390/electronics12173702 - 1 Sep 2023
Cited by 9 | Viewed by 2193
Abstract
Action assessment and feedback can effectively assist fitness practitioners in improving exercise benefits. In this paper, we address key challenges in human action recognition and assessment by proposing innovative methods that enhance performance while reducing computational complexity. Firstly, we present Oct-MobileNet, a lightweight backbone network, to overcome the limitations of the traditional OpenPose algorithm’s VGG19 network, which exhibits a large parameter size and high device requirements. Oct-MobileNet employs octave convolution and attention mechanisms to improve the extraction of high-frequency features from the human body contour, resulting in enhanced accuracy with reduced model computational burden. Furthermore, we introduce a novel approach for action recognition that combines skeleton-based information and multiple feature fusion. By extracting spatial geometric and temporal characteristics from actions, we employ a sliding window algorithm to integrate these features. Experimental results show the effectiveness of our approach, demonstrating its ability to accurately recognize and classify various human actions. Additionally, we address the evaluation of traditional fitness exercises, specifically focusing on the BaDunJin movements. We propose a multimodal information-based assessment method that combines pose detection and keypoint analysis. Label sequences are obtained through a pose detector and each frame’s keypoint coordinates are represented as pose vectors. Leveraging multimodal information, including label sequences and pose vectors, we explore action similarity and perform quantitative evaluations to help exercisers assess the quality of their exercise performance. Full article
(This article belongs to the Special Issue Human Computer Interaction in Intelligent System)

19 pages, 4363 KiB  
Article
TFC-GCN: Lightweight Temporal Feature Cross-Extraction Graph Convolutional Network for Skeleton-Based Action Recognition
by Kaixuan Wang and Hongmin Deng
Sensors 2023, 23(12), 5593; https://doi.org/10.3390/s23125593 - 15 Jun 2023
Cited by 7 | Viewed by 3245
Abstract
For skeleton-based action recognition, graph convolutional networks (GCNs) have clear advantages. Existing state-of-the-art (SOTA) methods tend to focus on extracting and identifying features from all bones and joints. However, they overlook many new input features that could be discovered. Moreover, many GCN-based action recognition models do not pay sufficient attention to the extraction of temporal features. In addition, most models have bloated structures due to too many parameters. To solve these problems, a temporal feature cross-extraction graph convolutional network (TFC-GCN) with a small number of parameters is proposed. Firstly, we propose a feature extraction strategy based on the relative displacements of joints, which captures the displacement of each joint between its previous and subsequent frames. Then, TFC-GCN uses a temporal feature cross-extraction block with gated information filtering to extract high-level representations of human actions. Finally, we propose a stitching spatial–temporal attention (SST-Att) block that assigns different weights to different joints to obtain favorable classification results. The FLOPs and the number of parameters of TFC-GCN are 1.90 G and 0.18 M, respectively. Its superiority has been verified on three large-scale public datasets, namely NTU RGB+D 60, NTU RGB+D 120 and UAV-Human. Full article
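The relative-displacement input feature mentioned in this abstract can be sketched as below. The abstract gives no formula, so this assumes the feature for frame t is the joint position in frame t+1 minus the position in frame t-1, with clamping at the sequence boundaries. The function name and the data layout are hypothetical.

```python
def relative_displacements(frames):
    """Per-joint displacement between the previous and the subsequent frame.

    frames: list of frames, each a list of (x, y, z) joint coordinates.
    Boundary frames reuse themselves as the missing neighbour (assumed choice).
    """
    last = len(frames) - 1
    features = []
    for t in range(len(frames)):
        previous_frame = frames[max(t - 1, 0)]
        next_frame = frames[min(t + 1, last)]
        features.append([
            tuple(b - a for a, b in zip(joint_prev, joint_next))
            for joint_prev, joint_next in zip(previous_frame, next_frame)
        ])
    return features

# One joint moving one unit per frame along x, over three frames.
sequence = [
    [(0.0, 0.0, 0.0)],
    [(1.0, 0.0, 0.0)],
    [(2.0, 0.0, 0.0)],
]
displacement = relative_displacements(sequence)
```

For the middle frame the joint moves two units between its neighbours, while the clamped boundary frames see only one unit of motion.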

18 pages, 3608 KiB  
Article
Research on a U-Net Bridge Crack Identification and Feature-Calculation Methods Based on a CBAM Attention Mechanism
by Huifeng Su, Xiang Wang, Tao Han, Ziyi Wang, Zhongxiao Zhao and Pengfei Zhang
Buildings 2022, 12(10), 1561; https://doi.org/10.3390/buildings12101561 - 28 Sep 2022
Cited by 53 | Viewed by 4624
Abstract
Crack detection on bridges is an important part of assessing whether a bridge is safe for service. Methods that rely on manual inspection and bridge-inspection vehicles have disadvantages, such as low efficiency and disruption to road traffic. We have conducted an in-depth study of bridge-crack detection methods and have proposed a bridge crack identification algorithm based on U-Net, called the CBAM-Unet algorithm. CBAM (Convolutional Block Attention Module) is a lightweight convolutional attention module that combines a channel attention module (CAM) and a spatial attention module (SAM), which apply attention along the channel and spatial dimensions, respectively. CBAM takes into account the characteristics of bridge cracks. When the attention mechanism is used, the ability to express shallow feature information is enhanced, making the identified cracks more complete and accurate. Experimental results show that the algorithm can achieve an accuracy of 92.66% for crack identification. We used Gaussian blur, Otsu thresholding and medial skeletonization algorithms to post-process the image and obtain a medial skeleton map. A crack feature measurement algorithm based on the skeletonised image is proposed, which measures the maximum width and length of the crack with errors of 1–6% and 1–8%, respectively, meeting the detection standard. The bridge crack feature extraction algorithm we present, CBAM-Unet, can effectively complete the crack-identification task, and the obtained image segmentation accuracy and parameter calculation meet the standards and requirements. This method greatly improves detection efficiency and accuracy and reduces detection costs. Full article
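One of the post-processing steps named in this abstract, Otsu thresholding, can be sketched in plain Python. This is the textbook between-class-variance formulation, not the authors' code; the function name, the 8-bit grayscale assumption, and the sample pixel values are hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level that maximises between-class variance.

    pixels: flat list of integer intensities in [0, levels).
    Pixels with intensity <= threshold form the darker class.
    """
    histogram = [0] * levels
    for value in pixels:
        histogram[value] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(histogram))

    weight_dark = 0
    sum_dark = 0
    best_threshold = 0
    best_variance = -1.0
    for level in range(levels):
        weight_dark += histogram[level]
        if weight_dark == 0:
            continue  # no pixels in the dark class yet
        weight_bright = total - weight_dark
        if weight_bright == 0:
            break  # every pixel is already in the dark class
        sum_dark += level * histogram[level]
        mean_dark = sum_dark / weight_dark
        mean_bright = (sum_all - sum_dark) / weight_bright
        variance = weight_dark * weight_bright * (mean_dark - mean_bright) ** 2
        if variance > best_variance:
            best_variance = variance
            best_threshold = level
    return best_threshold

# Toy bimodal image: dark crack pixels (10) on a bright surface (200).
sample_pixels = [10] * 50 + [200] * 50
threshold = otsu_threshold(sample_pixels)
crack_mask = [1 if value <= threshold else 0 for value in sample_pixels]
```

Here the darker class is treated as the crack, which is an assumption about image polarity; the resulting binary mask is the kind of input a skeletonization step would then thin to a one-pixel-wide centerline.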
(This article belongs to the Special Issue Structural Health Monitoring of Buildings, Bridges and Dams)

14 pages, 1562 KiB  
Article
A Lightweight Subgraph-Based Deep Learning Approach for Fall Recognition
by Zhenxiao Zhao, Lei Zhang and Huiliang Shang
Sensors 2022, 22(15), 5482; https://doi.org/10.3390/s22155482 - 22 Jul 2022
Cited by 9 | Viewed by 2379
Abstract
Falls pose a great danger to social development, especially to the elderly population. When a fall occurs, the body’s center of gravity moves from a high position to a low position, and the magnitude of change varies among body parts. Most existing fall recognition methods based on deep learning have not yet considered the differences between the movement and the change in amplitude of each body part. Besides, some problems exist such as complicated design, slow detection speed, and lack of timeliness. To alleviate these problems, a lightweight subgraph-based deep learning method utilizing skeleton information for fall recognition is proposed in this paper. The skeleton information of the human body is extracted by OpenPose, and an end-to-end lightweight subgraph-based network is designed. Sub-graph division and sub-graph attention modules are introduced to add a larger perceptual field while maintaining its lightweight characteristics. A multi-scale temporal convolution module is also designed to extract and fuse multi-scale temporal features, which enriches the feature representation. The proposed method is evaluated on a partial fall dataset collected in NTU and on two public datasets, and outperforms existing methods. It indicates that the proposed method is accurate and lightweight, which means it is suitable for real-time detection and rapid response to falls. Full article
(This article belongs to the Special Issue Vision and Sensor-Based Sensing in Human Action Recognition)

17 pages, 4599 KiB  
Article
Novel Bionic Design Method for Skeleton Structures Based on Load Path Analysis
by Zhaohua Wang, Nan Wu, Qingguo Wang, Yongxin Li, Quanwei Yang and Fenghe Wu
Appl. Sci. 2020, 10(22), 8251; https://doi.org/10.3390/app10228251 - 20 Nov 2020
Cited by 17 | Viewed by 4889
Abstract
Biological structures have excellent mechanical properties, including low weight and high stiffness. However, these are difficult to apply directly to given complex structures, such as an automobile frame or a control arm. In this study, a novel bionic design method for skeleton structures with complex features is proposed, based on the bio-inspired idea of “main-branch and sub-branch”. The envelope model of a given part is established by analyzing the structural functions and working conditions, and the load path is extracted by the load-transferred law as the structural main-branch. Then, the selection criterion for the bionic prototype is established from three aspects: load similarity, structural similarity and manufacturability. The cross-sections with high similarities are selected as the structural sub-branch. Finally, multi-objective size optimization is carried out and a new model is established. The method is applied to the bionic design of a control arm: the structural main-branch is obtained by load path analysis and the structural sub-branch is filled by a fish-bone structure. The design result shows that the structural stiffness is increased by 62.3%, while the weight is reduced by 24.75%. The method can also be used in other fields, including automotive, aerospace and civil engineering. Full article
(This article belongs to the Special Issue New Trends in Design Engineering)