MDPI - Publisher of Open Access Journals

24 pages, 5022 KiB

Open AccessArticle

Aging-Invariant Sheep Face Recognition Through Feature Decoupling

by Suhui Liu, Chuanzhong Xuan, Zhaohui Tang, Guangpu Wang, Xinyu Gao and Zhipan Wang

Animals 2025, 15(15), 2299; https://doi.org/10.3390/ani15152299 - 6 Aug 2025

Precise recognition of individual sheep plays a pivotal role in implementing smart agricultural platforms and optimizing herd management systems. With the development of deep learning technology, sheep face recognition provides an efficient and contactless solution for individual sheep identification. However, a sheep's facial features keep changing as it grows, so existing sheep face recognition models struggle to maintain accuracy over time and fall short of practical needs. To address this limitation, we propose the lifelong biometric learning of the sheep face network (LBL-SheepNet), a feature decoupling network designed for continuous adaptation to ovine facial changes, and construct a dataset of 31,200 images from 55 sheep tracked monthly from 1 to 12 months of age. The LBL-SheepNet model addresses dynamic variations in facial features during sheep growth through a multi-module architecture. First, a Squeeze-and-Excitation (SE) module enhances discriminative feature representation through adaptive channel-wise recalibration. Then, a nonlinear feature decoupling module employs a hybrid channel-batch attention mechanism to separate age-related features from identity-specific characteristics. Finally, a correlation analysis module utilizes adversarial learning to suppress age-biased feature interference, ensuring focus on age-invariant identifiers. Experimental results demonstrate that LBL-SheepNet achieves 95.5% identification accuracy and 95.3% average precision on the sheep face dataset. This study introduces a lifelong biometric learning (LBL) mechanism to mitigate recognition accuracy degradation caused by dynamic facial feature variations in growing sheep. By designing a feature decoupling network integrated with adversarial age-invariant learning, the proposed method addresses the performance limitations of existing models in long-term individual identification.
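
The channel-wise recalibration performed by an SE module can be sketched in a few lines. This is a minimal illustration of the generic squeeze, excitation, and scale steps, not the LBL-SheepNet implementation: the tensor sizes, the reduction ratio `R`, the random weights `w1` and `w2`, and the function name `se_recalibrate` are all assumptions, and bias terms are omitted for brevity.

```python
import math
import random

def se_recalibrate(feature_map, w1, w2):
    """Apply a Squeeze-and-Excitation step to a feature map shaped [C][H][W]."""
    # Squeeze: global average pooling gives one scalar per channel.
    squeezed = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
                for ch in feature_map]
    # Excitation, step 1: bottleneck fully connected layer followed by ReLU.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    # Excitation, step 2: expand back to C channels and apply a sigmoid gate.
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w2]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [[[g * v for v in row] for row in ch]
              for g, ch in zip(gates, feature_map)]
    return scaled, gates

random.seed(0)
C, H, W, R = 4, 2, 2, 2  # hypothetical sizes; R is the reduction ratio
features = [[[random.random() for _ in range(W)] for _ in range(H)]
            for _ in range(C)]
w1 = [[random.uniform(-1, 1) for _ in range(C)] for _ in range(C // R)]  # [C/R][C]
w2 = [[random.uniform(-1, 1) for _ in range(C // R)] for _ in range(C)]  # [C][C/R]
scaled, gates = se_recalibrate(features, w1, w2)
```

Each gate lies strictly between 0 and 1, so the block can only attenuate channels relative to one another; in a trained network the weights would be learned so that informative channels keep gates near 1.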

(This article belongs to the Section Animal System and Management)

21 pages, 12507 KiB

Open AccessArticle

Soil Amplification and Code Compliance: A Case Study of the 2023 Kahramanmaraş Earthquakes in Hayrullah Neighborhood

by Eyübhan Avcı, Kamil Bekir Afacan, Emre Deveci, Melih Uysal, Suna Altundaş and Mehmet Can Balcı

Buildings 2025, 15(15), 2746; https://doi.org/10.3390/buildings15152746 - 4 Aug 2025

Viewed by 245

Abstract

In the earthquakes that occurred in the Pazarcık (Mw = 7.7) and Elbistan (Mw = 7.6) districts of Kahramanmaraş Province on 6 February 2023, many buildings collapsed in the Hayrullah neighborhood of the Onikişubat district. In this study, we investigated whether soil amplification contributed to the damage in this neighborhood. First, borehole, SPT, multichannel analysis of surface waves (MASW), microtremor, electrical resistivity tomography (ERT), and vertical electrical sounding (VES) tests were carried out in the field to determine the engineering properties and behavior of the soil. Laboratory tests were also conducted on samples obtained from the boreholes and field tests. Then, an idealized soil profile was created from the laboratory and field test results, and dynamic site response analyses were performed on this profile. The earthquake level DD-2 design spectra of the project site were determined according to the Turkish Building Code (TBC 2018), and the average design spectrum was created. Considering the seismicity of the project site and the TBC (2018) criteria (site-specific faulting, distance, and average shear wave velocity), 11 earthquake ground motion sets were selected and matched to the DD-2 spectra in the short, medium, and long period ranges. Using the scaled motions, the soil profile was excited with 22 different earthquake scenarios, and results were obtained for the equivalent-linear and nonlinear models. The analysis showed that the soft soil conditions in the area amplified ground shaking by up to 2.8 times, especially at longer periods (1.0–2.5 s). This level of amplification was consistent with the damage observed in mid- to high-rise buildings, highlighting the important role of local site effects in the structural losses seen during the Kahramanmaraş earthquakes.

(This article belongs to the Section Building Structures)

31 pages, 6206 KiB

Open AccessArticle

High-Redundancy Design and Application of Excitation Systems for Large Hydro-Generator Units Based on ATS and DDS

by Xiaodong Wang, Xiangtian Deng, Xuxin Yue, Haoran Wang, Xiaokun Li and Xuemin He

Electronics 2025, 14(15), 3013; https://doi.org/10.3390/electronics14153013 - 29 Jul 2025

Viewed by 260

Abstract

The large-scale integration of stochastic renewable energy sources necessitates enhanced dynamic balancing capabilities in power systems, positioning hydropower as a critical balancing asset. Conventional excitation systems utilizing hot-standby dual-redundancy configurations remain susceptible to unit shutdown events caused by regulator failures. To mitigate this vulnerability, this study proposes a peer-to-peer distributed excitation architecture integrating asynchronous traffic shaping (ATS) and Data Distribution Service (DDS) technologies. This architecture utilizes control channels of equal priority and achieves high redundancy through cross-communication between discrete acquisition and computation modules. This research makes three key contributions: (1) design of a peer-to-peer distributed architectural framework; (2) development of a real-time data interaction methodology combining ATS and DDS, incorporating cross-layer parameter mapping, multi-priority queue scheduling, and congestion control mechanisms; (3) experimental validation of system reliability and redundancy through dynamic simulation. The results confirm the architecture’s operational efficacy, delivering both theoretical foundations and practical frameworks for highly reliable excitation systems.

(This article belongs to the Special Issue Power Electronics in Renewable Systems)

19 pages, 1711 KiB

Open AccessArticle

TSDCA-BA: An Ultra-Lightweight Speech Enhancement Model for Real-Time Hearing Aids with Multi-Scale STFT Fusion

by Zujie Fan, Zikun Guo, Yanxing Lai and Jaesoo Kim

Appl. Sci. 2025, 15(15), 8183; https://doi.org/10.3390/app15158183 - 23 Jul 2025

Viewed by 291

Abstract

Lightweight speech denoising models have made remarkable progress in improving both speech quality and computational efficiency. However, most models rely on long temporal windows as input, limiting their applicability in low-latency, real-time scenarios on edge devices. To address this challenge, we propose a lightweight hybrid module that combines Temporal Statistics Enhancement (TSE), Squeeze-and-Excitation-based Dual Convolutional Attention (SDCA), and Band-wise Attention (BA). The TSE module enhances single-frame spectral features by concatenating statistical descriptors (mean, standard deviation, maximum, and minimum), thereby capturing richer local information without relying on temporal context. The combined SDCA and BA module integrates a simplified residual structure and channel attention, while its BA component further strengthens the representation of critical frequency bands through band-wise partitioning and differentiated weighting. The proposed model requires only 0.22 million multiply–accumulate operations (MMACs) and contains a total of 112.3 K parameters, making it well suited for low-latency, real-time speech enhancement applications. Experimental results demonstrate that among lightweight models with fewer than 200 K parameters, the proposed approach outperforms most existing methods in both denoising performance and computational efficiency, significantly reducing processing overhead. Furthermore, real-device deployment on an improved hearing aid confirms an inference latency as low as 2 milliseconds, validating its practical potential for real-time edge applications.
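
The idea of enriching a single frame with its own statistics can be illustrated as follows. This is only a sketch of the concatenation step described above: the function name `append_frame_statistics`, the example magnitudes, and the choice of the population standard deviation are assumptions, and the abstract does not say whether the statistics are computed over the whole frame or per band.

```python
import statistics

def append_frame_statistics(frame):
    """Return the frame's spectral bins followed by mean, std, max, and min."""
    stats = [
        statistics.fmean(frame),   # mean magnitude of the frame
        statistics.pstdev(frame),  # population standard deviation (assumed)
        max(frame),                # largest bin
        min(frame),                # smallest bin
    ]
    # Concatenate along the feature axis: F bins become F + 4 features.
    return list(frame) + stats

frame = [0.1, 0.4, 0.3, 0.2]  # hypothetical magnitudes for one STFT frame
enhanced = append_frame_statistics(frame)
```

Because all four descriptors come from the current frame alone, the enriched feature vector needs no look-back or look-ahead buffer, which is what keeps the added latency negligible.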

(This article belongs to the Section Computing and Artificial Intelligence)

19 pages, 5417 KiB

Open AccessArticle

SE-TFF: Adaptive Tourism-Flow Forecasting Under Sparse and Heterogeneous Data via Multi-Scale SE-Net

by Jinyuan Zhang, Tao Cui and Peng He

Appl. Sci. 2025, 15(15), 8189; https://doi.org/10.3390/app15158189 - 23 Jul 2025

Viewed by 217

Abstract

Accurate and timely forecasting of cross-regional tourist flows is essential for sustainable destination management, yet existing models struggle with sparse data, complex spatiotemporal interactions, and limited interpretability. This paper presents SE-TFF, a multi-scale tourism-flow forecasting framework that couples a Squeeze-and-Excitation (SE) network with reinforcement-driven optimization to adaptively re-weight environmental, economic, and social features. A benchmark dataset of 17.8 million records from 64 countries and 743 cities (2016–2024) is compiled from the Open Travel Data (OPTD) repository on GitHub for training and validation. SE-TFF introduces (i) a multi-channel SE module for fine-grained feature selection under heterogeneous conditions, (ii) a Top-K attention filter to preserve salient context in highly sparse matrices, and (iii) a Double-DQN layer that dynamically balances prediction objectives. Experimental results show SE-TFF attains 56.5% MAE and 65.6% RMSE reductions over the best baseline (ARIMAX) at 20% sparsity, with 0.92 × 10³ average MAE across multi-task outputs. SHAP analysis ranks climate anomalies, tourism revenue, and employment as dominant predictors. These gains demonstrate SE-TFF’s ability to deliver real-time, interpretable forecasts for data-limited destinations. Future work will incorporate real-time social media signals and larger multimodal datasets to enhance generalizability.
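
One common way to realize a Top-K attention filter is to keep only the k largest attention scores and renormalize them with a softmax, giving every other position zero weight. The sketch below shows that generic technique; it is not taken from SE-TFF, and the function name `top_k_softmax`, the example scores, and the value of `k` are assumptions.

```python
import math

def top_k_softmax(scores, k):
    """Softmax over the k largest scores; all other positions get weight 0."""
    if not 0 < k <= len(scores):
        raise ValueError("k must be between 1 and len(scores)")
    # Indices of the k highest scores.
    keep = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    # Subtract the maximum kept score for numerical stability.
    peak = max(scores[i] for i in keep)
    exps = {i: math.exp(scores[i] - peak) for i in keep}
    total = sum(exps.values())
    return [exps[i] / total if i in exps else 0.0 for i in range(len(scores))]

weights = top_k_softmax([2.0, -1.0, 0.5, 3.0, 0.0], k=2)
```

With sparse inputs, most scores carry little signal, so zeroing everything outside the top k prevents many weak entries from diluting the few informative ones.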

15 pages, 1794 KiB

Open AccessArticle

Lightweight Dual-Attention Network for Concrete Crack Segmentation

by Min Feng and Juncai Xu

Sensors 2025, 25(14), 4436; https://doi.org/10.3390/s25144436 - 16 Jul 2025

Viewed by 331

Abstract

Structural health monitoring in resource-constrained environments demands crack segmentation models that match the accuracy of heavyweight convolutional networks while conforming to the power, memory, and latency limits of watt-level edge devices. This study presents a lightweight dual-attention network: a four-stage U-Net compressed to one-quarter of the channel depth and augmented, exclusively at the deepest layer, with a compact dual-attention block that couples channel excitation with spatial self-attention. The added mechanism increases computation by only 19%, limits the weight budget to 7.4 MB, and remains fully compatible with post-training INT8 quantization. On a pixel-labelled concrete crack benchmark, the proposed network achieves an intersection over union of 0.827 and an F1 score of 0.905, outperforming CrackTree, Hybrid 2020, MobileNetV3, and ESPNetv2. While refined weight initialization and Dice-augmented loss provide slight improvements, ablation experiments show that the dual-attention module is the main factor influencing accuracy. Hardware-in-the-loop tests validate real-time viability, reaching 110 frames per second on a 10 W Jetson Nano and 220 frames per second on a 5 W Coral TPU without observable accuracy loss. The proposed network thus offers cutting-edge crack segmentation at the kiloflop scale, facilitating ongoing, on-device civil infrastructure inspection.
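
The two reported accuracy figures can be cross-checked against each other. For a single binary class, IoU and F1 are both functions of the same true-positive (TP), false-positive (FP), and false-negative (FN) pixel counts, so one determines the other. The short derivation below assumes both metrics were computed over the same pooled pixel counts, which the abstract does not state.

```latex
\begin{align}
\mathrm{IoU} &= \frac{TP}{TP + FP + FN}, \qquad
F_1 = \frac{2\,TP}{2\,TP + FP + FN} \\
F_1 &= \frac{2\,TP}{TP + (TP + FP + FN)}
     = \frac{2\,\mathrm{IoU}}{1 + \mathrm{IoU}} \\
\mathrm{IoU} = 0.827 \;&\Rightarrow\; F_1 = \frac{1.654}{1.827} \approx 0.905
\end{align}
```

Under that assumption, the reported IoU of 0.827 and F1 score of 0.905 are mutually consistent.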

(This article belongs to the Special Issue Intelligent Sensor Technologies for Predictive Maintenance and Structural Health Monitoring)

20 pages, 2132 KiB

Open AccessArticle

Deep Learning with Dual-Channel Feature Fusion for Epileptic EEG Signal Classification

by Bingbing Yu, Mingliang Zuo and Li Sui

Eng 2025, 6(7), 150; https://doi.org/10.3390/eng6070150 - 2 Jul 2025

Viewed by 401

Abstract

Background: Electroencephalography (EEG) signals play a crucial role in diagnosing epilepsy by reflecting distinct patterns associated with normal brain activity, ictal (seizure) states, and interictal (between-seizure) periods. However, the manual classification of these patterns is labor-intensive, time-consuming, and depends heavily on specialized expertise. While deep learning methods have shown promise, many current models suffer from limitations such as excessive complexity, high computational demands, and insufficient generalizability. Developing lightweight and accurate models for real-time epilepsy detection remains a key challenge. Methods: This study proposes a novel dual-channel deep learning model to classify epileptic EEG signals into three categories: normal, ictal, and interictal states. Channel 1 integrates a bidirectional long short-term memory (BiLSTM) network with a Squeeze-and-Excitation (SE) ResNet attention module to dynamically emphasize critical feature channels. Channel 2 employs a dual-branch convolutional neural network (CNN) to extract deeper and distinct features. The model’s performance was evaluated on the publicly available Bonn EEG dataset. Results: The proposed model achieved an outstanding accuracy of 98.57%. The dual-channel structure improved specificity to 99.43%, while the dual-branch CNN boosted sensitivity by 5.12%. Components such as SE-ResNet attention modules contributed 4.29% to the accuracy improvement, and BiLSTM further enhanced specificity by 1.62%. Ablation studies validated the significance of each module. Conclusions: By leveraging a lightweight design and attention-based mechanisms, the dual-channel model offers high diagnostic precision while maintaining computational efficiency. Its applicability to real-time automated diagnosis positions it as a promising tool for clinical deployment across diverse patient populations.

(This article belongs to the Special Issue Advanced Artificial Intelligence Techniques for Disease Prediction, Diagnosis and Management)

20 pages, 760 KiB

Open AccessArticle

Detecting AI-Generated Images Using a Hybrid ResNet-SE Attention Model

by Abhilash Reddy Gunukula, Himel Das Gupta and Victor S. Sheng

Appl. Sci. 2025, 15(13), 7421; https://doi.org/10.3390/app15137421 - 2 Jul 2025

Viewed by 439

Abstract

The rapid advancements in generative artificial intelligence (AI), particularly through models like Generative Adversarial Networks (GANs) and diffusion-based architectures, have made it increasingly difficult to distinguish between real and synthetically generated images. While these technologies offer benefits in creative domains, they also pose serious risks in terms of misinformation, digital forgery, and identity manipulation. This paper presents a novel hybrid deep learning model for detecting AI-generated images by integrating Squeeze-and-Excitation (SE) attention blocks into the ResNet-50 backbone. The resulting SE-ResNet50 model enhances channel-wise feature recalibration and interpretability, enabling dynamic emphasis on subtle generative artifacts such as unnatural textures and semantic inconsistencies, thereby improving classification fidelity. Experimental evaluation on the CIFAKE dataset demonstrates the model’s effectiveness, achieving a test accuracy of 96.12%, precision of 97.04%, recall of 88.94%, F1-score of 92.82%, and an AUC score of 0.9862. The model shows strong generalization, minimal overfitting, and superior performance compared with transformer-based models and standard architectures like ResNet-50, VGGNet, and DenseNet. These results confirm the hybrid model’s suitability for real-time and resource-constrained applications in media forensics, content authentication, and ethical AI governance.
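
The reported F1-score can be sanity-checked from the reported precision and recall, since F1 is their harmonic mean. The snippet below is a plain arithmetic check using the figures quoted in the abstract; the function name `f1_score` is an illustrative choice, not part of the paper.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

reported_precision = 0.9704  # 97.04% from the abstract
reported_recall = 0.8894     # 88.94% from the abstract
f1 = f1_score(reported_precision, reported_recall)
print(f"F1 recomputed from precision and recall: {f1:.4f}")
```

The recomputed value is about 0.928, which agrees with the reported 92.82% to within the rounding of the two inputs.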

(This article belongs to the Special Issue Advanced Signal and Image Processing for Applied Engineering)

26 pages, 10233 KiB

Open AccessArticle

Time-Series Forecasting Method Based on Hierarchical Spatio-Temporal Attention Mechanism

by Zhiguo Xiao, Junli Liu, Xinyao Cao, Ke Wang, Dongni Li and Qian Liu

Sensors 2025, 25(13), 4001; https://doi.org/10.3390/s25134001 - 26 Jun 2025

Viewed by 572

Abstract

In the field of intelligent decision-making, time-series data collected by sensors serves as the core carrier for interaction between the physical and digital worlds. Accurate analysis is the cornerstone of decision-making in critical scenarios, such as industrial monitoring and intelligent transportation. However, the inherent spatio-temporal coupling and cross-period long-range dependencies of sensor data cause traditional time-series prediction methods to face performance bottlenecks in feature decoupling and multi-scale modeling. This study proposes a Spatio-Temporal Attention-Enhanced Network (TSEBG). The model employs a Squeeze-and-Excitation Network (SENet) to reconstruct the convolutional layers of the Temporal Convolutional Network (TCN), strengthening the feature expression of key time steps through dynamic channel weight allocation to address the redundancy of traditional causal convolutions in local pattern capture. A Bidirectional Gated Recurrent Unit (BiGRU) variant based on a global attention mechanism is designed, leveraging the collaboration between gating units and attention weights to mine cross-period long-range dependencies and effectively alleviate the vanishing gradient problem of recurrent neural network (RNN)-like models in multi-scale time-series analysis. A hierarchical feature fusion architecture is constructed to achieve multi-dimensional alignment of local spatial and global temporal features. Through residual connections and the dynamic adjustment of attention weights, hierarchical semantic representations are output. Experiments show that TSEBG outperforms current dominant models in single-step time-series prediction tasks in terms of accuracy and performance, with a cross-dataset R² standard deviation of only 3.7%, demonstrating excellent generalization stability. It provides a novel theoretical framework for feature decoupling and multi-scale modeling of complex time-series data.

(This article belongs to the Special Issue Intelligent Sensors for Condition Monitoring, Diagnosis, and Prognostics)

35 pages, 16759 KiB

Open AccessArticle

A Commodity Recognition Model Under Multi-Size Lifting and Lowering Sampling

by Mengyuan Chen, Song Chen, Kai Xie, Bisheng Wu, Ziyu Qiu, Haofei Xu and Jianbiao He

Electronics 2025, 14(11), 2274; https://doi.org/10.3390/electronics14112274 - 2 Jun 2025

Viewed by 528

Abstract

Object detection algorithms have evolved from two-stage to single-stage architectures, with foundation models achieving sustained improvements in accuracy. However, in intelligent retail scenarios, small object detection and occlusion issues still lead to significant performance degradation. To address these challenges, this paper proposes an improved model based on YOLOv11, focusing on resolving insufficient multi-scale feature coupling and occlusion sensitivity. First, a multi-scale feature extraction network (MFENet) is designed. It splits input feature maps into dual branches along the channel dimension: the upper branch performs local detail extraction and global semantic enhancement through secondary partitioning, while the lower branch integrates CARAFE (content-aware reassembly of features) upsampling and SENet (squeeze-and-excitation network) channel weight matrices to achieve adaptive feature enhancement. The three feature streams are fused to output multi-scale feature maps, significantly improving small object detail retention. Second, a convolutional block attention module (CBAM) is introduced during feature fusion, dynamically focusing on critical regions through channel–spatial dual attention mechanisms. A fuseModule is designed to aggregate multi-level features, enhancing contextual modeling for occluded objects. Additionally, the extreme-IoU (XIoU) loss function replaces the traditional complete-IoU (CIoU), combined with XIoU-NMS (extreme-IoU non-maximum suppression) to suppress redundant detections, optimizing convergence speed and localization accuracy. Experiments demonstrate that the improved model achieves a mean average precision (mAP50) of 0.997 (0.2% improvement) and mAP50-95 of 0.895 (3.5% improvement) on the RPC product dataset and the 6th Product Recognition Challenge dataset. The recall rate increases to 0.996 (0.6% improvement over baseline). 
Although the frame rate (FPS) decreased compared with the original model, the improved model still meets real-time requirements for retail scenarios. The model exhibits stable noise resistance in challenging environments and achieves 84% mAP in cross-dataset testing, validating its generalization capability and engineering applicability. Video streams were captured using a Zhongweiaoke camera operating at 60 fps, satisfying real-time detection requirements for intelligent retail applications.

(This article belongs to the Special Issue Emerging Technologies in Computational Intelligence)

22 pages, 6392 KiB

Open AccessArticle

Dual-Phase Severity Grading of Strawberry Angular Leaf Spot Based on Improved YOLOv11 and OpenCV

by Yi-Xiao Xu, Xin-Hao Yu, Qing Yi, Qi-Yuan Zhang and Wen-Hao Su

Plants 2025, 14(11), 1656; https://doi.org/10.3390/plants14111656 - 29 May 2025

Viewed by 660

Abstract

Phyllosticta fragaricola-induced angular leaf spot causes substantial economic losses in global strawberry production, necessitating advanced severity assessment methods. This study proposed a dual-phase grading framework integrating deep learning and computer vision. The enhanced You Only Look Once version 11 (YOLOv11) architecture incorporated a Content-Aware ReAssembly of FEatures (CARAFE) module for improved feature upsampling and a squeeze-and-excitation (SE) attention mechanism for channel-wise feature recalibration, resulting in the YOLOv11-CARAFE-SE model for the severity assessment of strawberry angular leaf spot. Furthermore, an OpenCV-based threshold segmentation algorithm using H-channel thresholds in the HSV color space achieved accurate lesion segmentation. A disease severity grading standard for strawberry angular leaf spot was established based on the ratio of lesion area to leaf area. In addition, specialized software for the assessment of disease severity was developed based on the improved YOLOv11-CARAFE-SE model and OpenCV-based algorithms. Experimental results show that, compared with the baseline YOLOv11, performance is significantly improved: the box mAP@0.5 is increased by 1.4% to 93.2%, the mask mAP@0.5 is increased by 0.9% to 93.0%, the inference time is shortened by 0.4 ms to 0.9 ms, and the computational load is reduced by 1.94% to 10.1 GFLOPs. In addition, this two-stage grading framework achieves an average accuracy of 94.2% in detecting selected strawberry angular leaf spot samples, providing real-time field diagnostics and high-throughput phenotypic analysis for resistance breeding programs. This work demonstrates the feasibility of rapidly estimating the severity of strawberry angular leaf spot, which will establish a robust technical framework for strawberry disease management under field conditions.
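
The second phase, grading severity from the ratio of lesion area to leaf area, can be sketched as follows. To keep the example dependency-free it uses Python's standard `colorsys` module rather than OpenCV, so hue runs from 0 to 1 here (OpenCV's 8-bit HSV images use 0 to 179 for H). The hue range, the grade boundaries, the example pixel colors, and both function names are placeholders, not the values or code from the paper.

```python
import colorsys

def lesion_ratio(leaf_pixels, hue_low, hue_high):
    """Fraction of leaf pixels whose hue lies in [hue_low, hue_high] (0..1)."""
    if not leaf_pixels:
        raise ValueError("leaf_pixels must not be empty")
    lesion = 0
    for r, g, b in leaf_pixels:
        hue, _, _ = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        if hue_low <= hue <= hue_high:
            lesion += 1  # pixel falls inside the assumed lesion hue band
    return lesion / len(leaf_pixels)

def severity_grade(ratio, upper_bounds=(0.05, 0.25, 0.50)):
    """Map a lesion ratio to a grade; the bounds are hypothetical."""
    for grade, bound in enumerate(upper_bounds, start=1):
        if ratio <= bound:
            return grade
    return len(upper_bounds) + 1

green = (40, 160, 40)   # stand-in for healthy leaf tissue
brown = (140, 90, 40)   # stand-in for lesion tissue
pixels = [green] * 8 + [brown] * 2  # pixels inside a segmented leaf mask
ratio = lesion_ratio(pixels, hue_low=0.0, hue_high=0.15)
grade = severity_grade(ratio)
```

In the pipeline described above, the leaf pixels would come from the instance mask predicted in the first phase, so the ratio is computed per leaf rather than per image.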

(This article belongs to the Section Crop Physiology and Crop Production)

28 pages, 17488 KiB

Open AccessArticle

Attentive Multi-Scale Features with Adaptive Context PoseResNet for Resource-Efficient Human Pose Estimation

by Ali Zakir, Sartaj Ahmed Salman, Gibran Benitez-Garcia and Hiroki Takahashi

Electronics 2025, 14(11), 2107; https://doi.org/10.3390/electronics14112107 - 22 May 2025

Viewed by 578

Abstract

Human Pose Estimation (HPE) remains challenging due to scale variation, occlusion, and high computational costs. Standard methods often struggle to capture detailed spatial information when keypoints are obscured, and they typically rely on computationally expensive deconvolution layers for upsampling, making them inefficient for real-time or resource-constrained scenarios. We propose AMFACPose (Attentive Multi-scale Features with Adaptive Context PoseResNet) to address these limitations. Specifically, our architecture incorporates Coordinate Convolution 2D (CoordConv2d) to retain explicit spatial context, alleviating the loss of coordinate information in conventional convolutions. To reduce computational overhead while maintaining accuracy, we utilize Depthwise Separable Convolutions (DSCs), separating spatial and pointwise operations. At the core of our approach is an Adaptive Feature Pyramid Network (AFPN), which replaces costly deconvolution-based upsampling by efficiently aggregating multi-scale features to handle diverse human poses and body sizes. We further introduce Dual-Gate Context Blocks (DGCBs) that refine global context to manage partial occlusions and cluttered backgrounds. The model integrates Squeeze-and-Excitation (SE) blocks and the Spatial–Channel Refinement Module (SCRM) to emphasize the most informative feature channels and spatial regions, which is particularly beneficial for occluded or overlapping keypoints. For precise keypoint localization, we replace dense heatmap predictions with coordinate classification using Multi-Layer Perceptron (MLP) heads. Experiments on the COCO and CrowdPose datasets demonstrate that AMFACPose surpasses the existing 2D HPE methods in both accuracy and computational efficiency. Moreover, our implementation on edge devices achieves real-time performance while preserving high accuracy, confirming the suitability of AMFACPose for resource-constrained pose estimation in both benchmark and real-world environments. 
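
The saving from Depthwise Separable Convolutions follows directly from a parameter count. The sketch below compares a standard convolution with its depthwise-plus-pointwise factorization; the layer sizes are hypothetical, bias terms are ignored, and the function names are illustrative.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out

def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depthwise k x k convolution plus a 1 x 1 pointwise one."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise

c_in, c_out, k = 64, 128, 3  # hypothetical layer sizes
standard = standard_conv_params(c_in, c_out, k)
separable = depthwise_separable_params(c_in, c_out, k)
ratio = separable / standard  # equals 1 / c_out + 1 / k**2
```

For this layer the factorized form needs 8,768 weights instead of 73,728, roughly 12% of the original, and the same ratio applies to multiply-accumulate operations at a fixed output resolution.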

(This article belongs to the Special Issue Image Processing Based on Convolution Neural Network: 2nd Edition)

14 pages, 938 KiB

Open AccessArticle

Gun–Bullet Model-Based Noncovalent Interactions Boosting Visible Light Photocatalytic Hydrogen Production in Poly Thieno[3,2-b]Thiophene/Graphitic Carbon Nitride Heterojunctions

by Yong Li, Jialu Tong, Zihao Chai, Yuanyuan Wu, Dongting Wang and Hongbin Li

Polymers 2025, 17(10), 1417; https://doi.org/10.3390/polym17101417 - 21 May 2025

Viewed by 352

Abstract

Linear conjugated polymer photocatalysts are still hampered by low charge separation efficiency and poor water dispersibility, which are crucial factors in the photocatalytic water splitting process. Herein, we synthesized Poly thieno[3,2-b]thiophene (PTT) nanoparticles with excellent visible light response characteristics. Subsequently, we constructed gun–bullet model PTT/graphitic carbon nitride (PTT/g-C₃N₄) heterojunctions for photocatalytic hydrogen production, where PTT with good visible light response serves as the bullets and g-C₃N₄ with good water dispersibility serves as the guns. The as-prepared PTT/g-C₃N₄ heterojunctions show greatly accelerated charge separation and excellent photocatalytic hydrogen production performance. Specifically, 10PTT/g-C₃N₄ demonstrates extraordinary hydrogen production performance, reaching 6.56 mmol g⁻¹ h⁻¹ (2 wt% Pt loading, 0.1 M AA as sacrificial agent, λ > 420 nm), calculated to be 15.3 and 22.6 times those of PTT and g-C₃N₄, respectively. Mechanistic studies reveal that the significantly improved performance of PTT/g-C₃N₄ heterojunctions is ascribed to the accelerated charge transfer, which originates from the C…S/N…S noncovalent interactions between PTT and g-C₃N₄. The C…S/N…S noncovalent interactions act as an efficient interface charge transmission channel (ICTC), accelerating the steady stream of excited electron transfer from the lowest unoccupied molecular orbital (LUMO) of PTT to that of g-C₃N₄. The gun–bullet model heterojunctions proposed here provide a practical strategy for achieving exceptional visible light photocatalytic hydrogen production by combining charge separation with water dispersibility in polymer/polymer heterojunctions via noncovalent interactions.

(This article belongs to the Section Polymer Applications)
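
As a quick reading aid for the figures quoted in the abstract above, the reported rate and enhancement factors imply approximate baseline rates for the two pure components. The sketch below only divides the quoted numbers; the function name `implied_baseline` is hypothetical, and the results are back-calculated estimates, not values stated in the abstract.

```python
# Figures quoted in the abstract (10PTT/g-C3N4, 2 wt% Pt, 0.1 M AA, lambda > 420 nm).
RATE_HETEROJUNCTION = 6.56  # mmol g^-1 h^-1
FOLD_OVER_PTT = 15.3        # enhancement factor relative to pure PTT
FOLD_OVER_GCN = 22.6        # enhancement factor relative to pure g-C3N4


def implied_baseline(rate: float, fold: float) -> float:
    """Back-calculate a baseline rate from an enhanced rate and its fold increase."""
    if fold <= 0:
        raise ValueError("fold increase must be positive")
    return rate / fold


# Implied (not reported) baseline rates, in mmol g^-1 h^-1.
ptt_rate = implied_baseline(RATE_HETEROJUNCTION, FOLD_OVER_PTT)
gcn_rate = implied_baseline(RATE_HETEROJUNCTION, FOLD_OVER_GCN)

print(f"implied PTT rate:     {ptt_rate:.3f} mmol g^-1 h^-1")
print(f"implied g-C3N4 rate:  {gcn_rate:.3f} mmol g^-1 h^-1")
```

Under these assumptions the implied baselines come out at roughly 0.43 and 0.29 mmol g⁻¹ h⁻¹ for PTT and g-C₃N₄, respectively.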

19 pages, 11563 KiB

Open AccessArticle

Research on Concrete Crack and Depression Detection Method Based on Multi-Level Defect Fusion Segmentation Network

by Zhaochen Yao, Yanjuan Li, Hao Fu, Jun Tian, Yang Zhou, Chee-Loong Chin and Chau-Khun Ma

Buildings 2025, 15(10), 1657; https://doi.org/10.3390/buildings15101657 - 14 May 2025

Viewed by 499

Abstract

Cracks and dents in concrete structures are core defects that threaten building safety, but existing YOLO-series algorithms face significant bottlenecks in complex engineering scenarios. Tiny cracks are susceptible to background texture interference, leading to misjudgment, and the traditional detection box has difficulty accurately characterizing dent geometry, which affects quantitative damage assessment. In this paper, we propose a Multi-level Defect Fusion Segmentation Network (MDFNet) that overcomes the single-task limitation through a synergistic detection-segmentation framework. We improve the anchor box strategy of YOLOv11 and enhance the recall of small targets by combining it with Copy–Pasting, and then enhance the pixel-level characterization of crack edges and dent contours by embedding the Head Attention-Expanded Convolutional Fusion Module (HAEConv) in U-Net with squeeze-and-excitation (SE) channel attention. The detection loss and segmentation loss are combined for joint task optimization. On our self-constructed concrete defect dataset, MDFNet significantly outperforms the baseline model. In terms of accuracy, MDFNet achieves a Dice coefficient of 92.4%, an improvement of 4.1 percentage points over YOLOv11-Seg, and a mean Intersection over Union (mIoU) of 81.6%, with strong generalization ability under complex background interference. In terms of engineering efficiency, the model processes 640 × 640 images at 45 frames per second (FPS), which meets real-time monitoring requirements. The experimental results verify the feasibility of the model for crack and dent detection in concrete structures.

(This article belongs to the Special Issue Advanced Research on Cementitious Composites for Construction)
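
For readers unfamiliar with the two segmentation metrics quoted in the abstract above, the following is a minimal sketch of how a Dice coefficient and an IoU are typically computed for binary masks. It is not the authors' evaluation code: the function names, the toy masks, and the choice to average IoU over two classes (defect and background) for `mean_iou` are all assumptions made for illustration.

```python
from typing import List

Mask = List[List[int]]  # 2D binary mask: 1 = defect pixel, 0 = background


def _counts(pred: Mask, target: Mask):
    """Return (overlap, predicted positives, ground-truth positives)."""
    overlap = pred_pos = target_pos = 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            overlap += p & t
            pred_pos += p
            target_pos += t
    return overlap, pred_pos, target_pos


def dice(pred: Mask, target: Mask) -> float:
    """Dice = 2|A ∩ B| / (|A| + |B|); defined as 1.0 when both masks are empty."""
    overlap, pred_pos, target_pos = _counts(pred, target)
    denom = pred_pos + target_pos
    return 1.0 if denom == 0 else 2 * overlap / denom


def iou(pred: Mask, target: Mask) -> float:
    """IoU = |A ∩ B| / |A ∪ B|; defined as 1.0 when both masks are empty."""
    overlap, pred_pos, target_pos = _counts(pred, target)
    union = pred_pos + target_pos - overlap
    return 1.0 if union == 0 else overlap / union


def mean_iou(pred: Mask, target: Mask) -> float:
    """Assumed two-class mIoU: average of defect IoU and background IoU."""
    def invert(mask: Mask) -> Mask:
        return [[1 - v for v in row] for row in mask]
    return (iou(pred, target) + iou(invert(pred), invert(target))) / 2


# Hypothetical 2 x 3 masks, for illustration only.
pred = [[1, 1, 0], [0, 1, 0]]
target = [[1, 0, 0], [0, 1, 1]]
print(dice(pred, target), iou(pred, target), mean_iou(pred, target))
```

Note that Dice is always at least as large as IoU for the same pair of masks, which is consistent with the abstract reporting a higher Dice (92.4%) than mIoU (81.6%).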

18 pages, 2731 KiB

Open AccessArticle

Prediction of Dissolved Gas in Transformer Oil Based on Variational Mode Decomposition Integrated with Long Short-Term Memory

by Guoping Chen, Jianhong Li, Yong Li, Xinming Hu, Jian Wang and Tao Li

Processes 2025, 13(5), 1446; https://doi.org/10.3390/pr13051446 - 9 May 2025

Viewed by 501

Abstract

To address the nonlinear and non-stationary characteristics of dissolved gas concentration data in transformer oil, this paper proposes a hybrid prediction model (VMD-SSA-LSTM-SE) that integrates Variational Mode Decomposition (VMD), the Whale Optimization Algorithm (WOA), the Sparrow Search Algorithm (SSA), Long Short-Term Memory (LSTM), and the Squeeze-and-Excitation (SE) attention mechanism. First, WOA dynamically optimizes the VMD parameters (mode number k and penalty factor α) to effectively separate noise from valid signals and avoid modal aliasing. Then, SSA globally searches for optimal LSTM hyperparameters (hidden layer nodes, learning rate, etc.) to enhance feature mining for non-continuous data. The SE attention mechanism recalibrates channel-wise feature weights to capture critical time-series patterns. Experimental validation using real transformer oil data demonstrates that the model outperforms existing methods in prediction accuracy and computational efficiency. For instance, on the CH₄ test set the model achieves a Mean Absolute Error (MAE) of 0.17996 μL/L, a Mean Absolute Percentage Error (MAPE) of 1.4423%, and an average runtime of 82.7 s, making it significantly faster than CEEMDAN-based models. These results provide robust technical support for transformer fault prediction and condition-based maintenance, highlighting the model’s effectiveness in handling non-stationary time-series data.

(This article belongs to the Topic Recent Advances in Smart Grid and Energy Storage Applications)
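
The two error metrics quoted in the abstract above follow standard definitions, sketched below in plain Python. This is an illustration only: the function names and the sample concentrations are hypothetical and are not taken from the paper's data.

```python
from typing import Sequence


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean Absolute Error, in the same unit as the data (e.g. uL/L)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean Absolute Percentage Error, in percent; undefined if any actual value is 0."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical gas concentrations (uL/L), for illustration only.
actual = [10.0, 20.0, 40.0]
predicted = [11.0, 18.0, 40.0]
print(f"MAE  = {mae(actual, predicted):.3f} uL/L")
print(f"MAPE = {mape(actual, predicted):.3f} %")
```

MAE keeps the unit of the measured quantity, whereas MAPE is scale-free, which is why the abstract reports the former in μL/L and the latter as a percentage.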

Search Results (266)
