MDPI - Publisher of Open Access Journals

27 pages, 14010 KB

Open Access Article

A Novel Unsupervised Structural Damage Detection Method Based on TCN-GAT Autoencoder

by Yanchun Ni, Qiyuan Jin and Rui Hu

Sensors 2025, 25(21), 6724; https://doi.org/10.3390/s25216724 - 3 Nov 2025

Structural damage detection is crucial for ensuring the safety and durability of engineering structures over a service life of several decades. However, existing methods often overlook the spatiotemporal coupling in multi-sensor data, hindering the full exploitation of structural dynamic evolution and spatial correlations. This paper proposes an autoencoder model integrating Temporal Convolutional Networks (TCN) and Graph Attention Networks (GAT), termed TCNGAT-AE, to establish an unsupervised damage detection method. The model uses the TCN module to extract temporal dependencies and dynamic features from vibration signals, while the GAT module explicitly captures the spatial topological relationships within the sensor network, thereby achieving deep fusion of spatiotemporal features. The proposed method adopts an “offline training-online detection” framework, requiring only data from the healthy state of the structure for training, and employs reconstruction error as the damage indicator. To validate the proposed method, two sets of experimentally measured data are used: one from the Z-24 concrete box-girder bridge under ambient excitation, and the other from the Old Ada Bridge under vehicle load excitation. Additionally, ablation studies are conducted to analyze the effectiveness of the spatiotemporal fusion mechanism. Results demonstrate that the proposed method achieves effective damage detection across different structural types and excitation scenarios. Furthermore, the explicit modeling of spatiotemporal features significantly enhances detection performance, with the anomaly detection rate showing substantial improvement compared to baseline models using only temporal or spatial modeling. Moreover, this end-to-end framework processes raw vibration signals directly, avoiding complex preprocessing. This makes it highly suitable for practical and near-real-time monitoring.
The findings of this study demonstrate that the damage detection method based on TCNGAT-AE can be effectively applied to structural safety monitoring in complex engineering environments, and can be further integrated with real-time monitoring systems of critical structures for online analysis.
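
The use of reconstruction error as a damage indicator can be illustrated with a minimal sketch. The abstract does not state how the alarm threshold is chosen, so the mean-plus-three-standard-deviations rule below is an assumption, as are the function names and the error values.

```python
import statistics

def fit_threshold(healthy_errors, k=3.0):
    """Assumed rule: threshold = mean + k * std of healthy-state reconstruction errors."""
    mu = statistics.fmean(healthy_errors)
    sigma = statistics.pstdev(healthy_errors)
    return mu + k * sigma

def detection_rate(errors, threshold):
    """Fraction of monitored windows whose reconstruction error exceeds the threshold."""
    flagged = sum(e > threshold for e in errors)
    return flagged / len(errors)

# Hypothetical reconstruction errors (not taken from the paper).
healthy = [0.010, 0.012, 0.011, 0.009, 0.013, 0.010]
monitored = [0.011, 0.030, 0.028, 0.012]

threshold = fit_threshold(healthy)
rate = detection_rate(monitored, threshold)
```

Because the threshold is fitted on healthy-state data only, no damaged-state labels are needed, which matches the unsupervised setting described in the abstract.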

(This article belongs to the Special Issue Women’s Special Issue Series: Sensors)

14 pages, 981 KB

Open Access Feature Paper Article

What Does That Head Tilt Mean? Brain Lateralization and Sex Differences in the Processing of Familiar Human Speech by Domestic Dogs

by Colleen Buckley, Courtney L. Sexton, George Martvel, Erin E. Hecht, Brenda J. Bradley, Anna Zamansky and Francys Subiaul

Animals 2025, 15(21), 3179; https://doi.org/10.3390/ani15213179 - 31 Oct 2025

Viewed by 113

Abstract

Does the head tilt observed in many domesticated dogs index lateralized language processing? To answer this question, the present study evaluated household dogs responding to four conditions in which owners provided an increasing number of communicative cues. These cues ranged from no communicative/affective cues to rich affective cues coupled with dog-directed speech. Dogs’ facial responses were first coded manually using the Dog Facial Action Coding System (DogFACS), followed by an in-depth investigation of head tilt behavior, in which AI-based automated analysis of head tilt and audio analysis of acoustic features extracted from communicative cues were implemented. In a sample of 103 dogs representing seven breed groups and mixed-breed dogs, we found significant differences in the number of head tilts occurring between conditions, with the most communicative (last) condition eliciting the most head tilts. There were also significant differences in the direction of the head tilts and between sex groups. Dogs were more likely to tilt their heads to the right, and neutered male dogs were more likely to tilt their heads than spayed female dogs. The right-tilt bias is consistent with left-hemisphere language processing in humans, with males processing language in a more lateralized manner and females processing language more bilaterally, a pattern also observed in humans. Understanding the canine brain is important both for evolutionary research through a comparative lens and for understanding our interspecies relationship.

(This article belongs to the Section Human-Animal Interactions, Animal Behaviour and Emotion)

20 pages, 918 KB

Open Access Article

MVIB-Lip: Multi-View Information Bottleneck for Visual Speech Recognition via Time Series Modeling

by Yuzhe Li, Haocheng Sun, Jiayi Cai and Jin Wu

Entropy 2025, 27(11), 1121; https://doi.org/10.3390/e27111121 - 31 Oct 2025

Viewed by 145

Abstract

Lipreading, or visual speech recognition, is the task of interpreting utterances solely from visual cues of lip movements. While early approaches relied on Hidden Markov Models (HMMs) and handcrafted spatiotemporal descriptors, recent advances in deep learning have enabled end-to-end recognition using large-scale datasets. However, such methods often require millions of labeled or pretraining samples and struggle to generalize under low-resource or speaker-independent conditions. In this work, we revisit lipreading from a multi-view learning perspective. We introduce MVIB-Lip, a framework that integrates two complementary representations of lip movements: (i) raw landmark trajectories modeled as multivariate time series, and (ii) recurrence plot (RP) images that encode structural dynamics in a texture form. A Transformer encoder processes the temporal sequences, while a ResNet-18 extracts features from RPs; the two views are fused via a product-of-experts posterior regularized by the multi-view information bottleneck. Experiments on the OuluVS dataset and a self-collected dataset demonstrate that MVIB-Lip consistently outperforms handcrafted baselines and improves generalization to speaker-independent recognition. Our results suggest that recurrence plots, when coupled with deep multi-view learning, offer a principled and data-efficient path forward for robust visual speech recognition.
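
To make the recurrence plot (RP) view concrete, here is a minimal sketch of a thresholded RP built from a multivariate trajectory. The Euclidean distance, the threshold value, and the toy landmark coordinates are assumptions for illustration; the abstract does not specify how the paper constructs its RPs.

```python
import math

def recurrence_plot(series, eps):
    """Binary recurrence matrix: R[i][j] = 1 when frames i and j lie within eps of each other."""
    n = len(series)
    return [
        [1 if math.dist(series[i], series[j]) <= eps else 0 for j in range(n)]
        for i in range(n)
    ]

# Hypothetical 2-D landmark trajectory with four frames.
trajectory = [(0.0, 0.0), (1.0, 0.0), (0.1, 0.0), (1.0, 0.1)]
rp = recurrence_plot(trajectory, eps=0.2)
```

The resulting matrix is symmetric with ones on the diagonal, and can be rendered as an image for a convolutional feature extractor.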

(This article belongs to the Special Issue The Information Bottleneck Method: Theory and Applications)

25 pages, 16046 KB

Open Access Article

UAV-Based Multimodal Monitoring of Tea Anthracnose with Temporal Standardization

by Qimeng Yu, Jingcheng Zhang, Lin Yuan, Xin Li, Fanguo Zeng, Ke Xu, Wenjiang Huang and Zhongting Shen

Agriculture 2025, 15(21), 2270; https://doi.org/10.3390/agriculture15212270 - 31 Oct 2025

Viewed by 191

Abstract

Tea Anthracnose (TA), caused by fungi of the genus Colletotrichum, is one of the major threats to global tea production. UAV remote sensing has been explored for non-destructive and high-efficiency monitoring of diseases in tea plantations. However, variations in illumination, background, and meteorological factors undermine the stability of cross-temporal data. Data processing and modeling complexity further limits model generalizability and practical application. This study introduced a cross-temporal, generalizable disease monitoring approach based on UAV multimodal data coupled with relative-difference standardization. In an experimental tea garden, we collected multispectral, thermal infrared, and RGB images and extracted four classes of features: spectral (Sp), thermal (Th), texture (Te), and color (Co). The Normalized Difference Vegetation Index (NDVI) was used to identify reference areas and standardize features, which significantly reduced the relative differences in cross-temporal features. Additionally, we developed a vegetation–soil relative temperature (VSRT) index, which exhibits higher temporal-phase consistency than the conventional normalized relative canopy temperature (NRCT). A multimodal optimal feature set was constructed through sensitivity analysis based on the four feature categories. For different modality combinations (single and fused), three machine learning algorithms, K-Nearest Neighbors (KNN), Support Vector Machine (SVM), and Multi-layer Perceptron (MLP), were selected to evaluate disease classification performance due to their low computational burden and ease of deployment. Results indicate that the “Sp + Th” combination achieved the highest accuracy (95.51%), with KNN (95.51%) outperforming SVM (94.23%) and MLP (92.95%). Moreover, under the optimal feature combination and KNN algorithm, the model achieved high generalizability (86.41%) on independent temporal data. 
This study demonstrates that fusing spectral and thermal features with temporal standardization, combined with the simple and effective KNN algorithm, achieves accurate and robust tea anthracnose monitoring, providing a practical solution for efficient and generalizable disease management in tea plantations.
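
The NDVI-guided standardization can be sketched as follows. The NDVI formula is standard; the reference-area threshold of 0.6, the relative-difference form (value minus reference mean, divided by reference mean), and the pixel values are assumptions, since the abstract does not give the exact definitions.

```python
import statistics

def ndvi(nir, red):
    """Normalized Difference Vegetation Index from near-infrared and red reflectance."""
    return (nir - red) / (nir + red)

def standardize(pixels, ndvi_min=0.6):
    """Express each feature relative to the mean over high-NDVI reference pixels.

    pixels: list of (nir, red, feature) tuples. ndvi_min is a hypothetical cutoff.
    """
    reference = [f for nir, red, f in pixels if ndvi(nir, red) >= ndvi_min]
    ref_mean = statistics.fmean(reference)
    return [(f - ref_mean) / ref_mean for _, _, f in pixels]

# Hypothetical pixels: (NIR reflectance, red reflectance, canopy temperature in deg C).
pixels = [(0.50, 0.10, 20.0), (0.48, 0.08, 22.0), (0.30, 0.20, 26.25)]
relative = standardize(pixels)
```

Expressing features relative to a same-scene reference is one way to reduce the effect of illumination and weather differences between acquisition dates.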

(This article belongs to the Section Crop Protection, Diseases, Pests and Weeds)

22 pages, 2507 KB

Open Access Article

Analysis of Process Intensification Impact on Circular Economy in Levulinic Acid Purification Schemes

by Tania Itzel Serrano-Arévalo, Heriberto Alcocer-García, César Ramírez-Márquez and José María Ponce-Ortega

Processes 2025, 13(11), 3496; https://doi.org/10.3390/pr13113496 - 30 Oct 2025

Viewed by 286

Abstract

This study presents a comprehensive evaluation of levulinic acid purification schemes from a circular economy perspective, integrating resource-based indicators with economic and environmental metrics. Twelve alternatives, ranging from conventional distillation sequences to intensified hybrid systems, were assessed using indicators such as Relative Material Impact, total annual cost, Eco-Indicator 99, fuel demand, and CO₂ emissions. The novelty of this work lies in extending the assessment beyond purification infrastructure to include upstream systems that supply energy demand, such as fuel extraction and steam generation. The configurations considered incorporate thermal couplings, dividing wall columns, and decanters, which influence energy efficiency, process complexity, and resource depletion. Among these, the TDWS-D configuration (Thermally Coupled Double Dividing Wall Column System with Decanter) exhibits the highest values in DMR, TAC, and CO₂ emissions, driven by its elevated energy demand and complex infrastructure. Conversely, the TCS2 configuration (Thermally Coupled Sequence, featuring selective heat integration between distillation columns) achieves the lowest impact across all metrics, demonstrating that selective and strategic intensification (rather than maximalist design) can yield superior sustainability outcomes. Across all scenarios, the boiler stage was identified as the main contributor to material depletion, followed by fuel extraction and purification equipment. Notably, some conventional designs proved superior to intensified ones in terms of circularity, challenging the assumption that intensification inherently guarantees sustainability. Overall, the integration of circular economy indicators enables a multidimensional evaluation framework that supports more responsible and resource-efficient process design.

(This article belongs to the Special Issue Modeling, Simulation and Control in Energy Systems—2nd Edition)

37 pages, 25662 KB

Open Access Article

A Hyperspectral Remote Sensing Image Encryption Algorithm Based on a Novel Two-Dimensional Hyperchaotic Map

by Zongyue Bai, Qingzhan Zhao, Wenzhong Tian, Xuewen Wang, Jingyang Li and Yuzhen Wu

Entropy 2025, 27(11), 1117; https://doi.org/10.3390/e27111117 - 30 Oct 2025

Viewed by 99

Abstract

With the rapid advancement of hyperspectral remote sensing technology, the security of hyperspectral images (HSIs) has become a critical concern. However, traditional image encryption methods, designed primarily for grayscale or RGB images, fail to address the high dimensionality, large data volume, and spectral-domain characteristics inherent to HSIs. Existing chaotic encryption schemes often suffer from limited chaotic performance, narrow parameter ranges, and inadequate spectral protection, leaving HSIs vulnerable to spectral feature extraction and statistical attacks. To overcome these limitations, this paper proposes a novel hyperspectral image encryption algorithm based on a newly designed two-dimensional cross-coupled hyperchaotic map (2D-CSCM), which synergistically integrates Cubic, Sinusoidal, and Chebyshev maps. The 2D-CSCM exhibits superior hyperchaotic behavior, including a wider hyperchaotic parameter range, enhanced randomness, and higher complexity, as validated by Lyapunov exponents, sample entropy, and NIST tests. Building on this, a layered encryption framework is introduced: spectral-band scrambling to conceal spectral curves while preserving spatial structure, spatial pixel permutation to disrupt correlation, and a bit-level diffusion mechanism based on dynamic DNA encoding, specifically designed to secure high bit-depth digital number (DN) values (typically >8 bits). Experimental results on multiple HSI datasets demonstrate that the proposed algorithm achieves near-ideal information entropy (up to 15.8107 for 16-bit data), negligible adjacent-pixel correlation (below 0.01), and strong resistance to statistical, cropping, and differential attacks (NPCR ≈ 99.998%, UACI ≈ 33.30%). The algorithm not only ensures comprehensive encryption of both spectral and spatial information but also supports lossless decryption, offering a robust and practical solution for secure storage and transmission of hyperspectral remote sensing imagery.
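
The evaluation metrics quoted above follow widely used definitions, sketched here for flattened pixel lists. The function names and toy inputs are illustrative and not taken from the paper; the paper's exact evaluation code may differ.

```python
import math
from collections import Counter

def information_entropy(pixels):
    """Shannon entropy in bits; the upper bound is the bit depth (16 for 16-bit data)."""
    n = len(pixels)
    return -sum(c / n * math.log2(c / n) for c in Counter(pixels).values())

def npcr(cipher_a, cipher_b):
    """Number of Pixels Change Rate: percentage of positions whose values differ."""
    changed = sum(a != b for a, b in zip(cipher_a, cipher_b))
    return 100.0 * changed / len(cipher_a)

def uaci(cipher_a, cipher_b, bit_depth):
    """Unified Average Changing Intensity, normalized by the maximum pixel value."""
    max_value = 2 ** bit_depth - 1
    total = sum(abs(a - b) for a, b in zip(cipher_a, cipher_b))
    return 100.0 * total / (max_value * len(cipher_a))

# Toy inputs chosen so the results are easy to verify by hand.
h = information_entropy([0, 1, 2, 3])
n = npcr([1, 2, 3, 4], [1, 0, 3, 9])
u = uaci([0, 65535], [65535, 65535], bit_depth=16)
```

In a differential-attack test, the two cipher images come from plaintexts that differ in a single pixel; values near 100% NPCR and roughly one third UACI indicate strong diffusion.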

(This article belongs to the Section Signal and Data Analysis)

15 pages, 3140 KB

Open Access Systematic Review

Systematic Review of Line-Field Confocal Optical Coherence Tomography for Diagnosing Pre-Malignant and Malignant Keratinocytic Lesions: Optimising the Workflow

by Maria Luísa Santos e Silva Caldeira Marques, Justin Hero, Mary-Ann el-Sharouni, Marta García Bustínduy and Pascale Guitera

Diagnostics 2025, 15(21), 2746; https://doi.org/10.3390/diagnostics15212746 - 29 Oct 2025

Viewed by 306

Abstract

Background: Line-field confocal optical coherence tomography (LC-OCT) is a non-invasive imaging technique providing high-resolution en-face and cross-sectional views of the epidermis and superficial dermis for in vivo characterisation of actinic keratosis (AK), Bowen’s disease (BD) and squamous cell carcinoma (SCC). Despite its promise, standardised imaging protocols are lacking. Objective: This systematic review aims to assess the utility of LC-OCT for diagnosing AK, BD and SCC, with particular emphasis on workflow optimisation and protocol standardisation. Methods: A systematic literature search was performed using PubMed, Embase, and Scopus databases (January 2018–October 2024). Two reviewers independently screened the records, extracted data and applied the Confidence in the Evidence from Reviews of Qualitative research (CERQual) framework to assess confidence in key findings. Results: Eleven studies met the inclusion criteria. LC-OCT reliably identified key histopathological correlates. Across studies, LC-OCT consistently visualised hyperkeratosis, keratinocytic atypia, parakeratosis, and acanthosis, as well as characteristic vascular alterations and dermal remodeling. LC-OCT also demonstrated its capacity to detect invasive features by revealing disruptions in the dermo-epidermal junction and the presence of tumour strands infiltrating the dermis. Multimodal imaging, combined with technical optimisations such as minimal probe pressure, paraffin oil coupling, and dermoscopy-guided localisation, substantially improved image resolution and interobserver concordance. Conclusions: This systematic review provides a basis for establishing standardised LC-OCT imaging protocols in keratinocytic tumours. While LC-OCT shows promise as a non-invasive diagnostic tool, further multicenter studies are needed to refine imaging workflows and evaluate the integration of artificial intelligence-based analysis to improve diagnostic accuracy and reproducibility.

(This article belongs to the Section Biomedical Optics)

18 pages, 4640 KB

Open Access Article

Cable Outer Sheath Defect Identification Using Multi-Scale Leakage Current Features and Graph Neural Networks

by Musong Lin, Hankun Wei, Xukai Duan, Zhi Li, Qiang Fu and Yong Liu

Energies 2025, 18(21), 5687; https://doi.org/10.3390/en18215687 - 29 Oct 2025

Viewed by 156

Abstract

The outer sheath of power cables is prone to mechanical damage and environmental stress during long-term operation, and early defects are often difficult to detect accurately using conventional methods. To address this challenge, this paper proposes an outer sheath defect identification method based on leakage current features and graph neural networks. An electro–thermal coupling physical model was first built to simulate the electric field distribution and thermal effects under typical defects, thereby revealing the mechanisms by which defects influence leakage current and harmonic components. A power-frequency high-voltage experimental platform was then constructed to collect leakage current signals under conditions such as scratches, indentations, moisture, and chemical corrosion. Multi-scale frequency band features were extracted using wavelet packet decomposition to construct correlation graphs, which were further modeled through a combination of graph convolutional networks and long short-term memory networks for spatiotemporal analysis. Experimental results demonstrate that the proposed method effectively improves the identification of defect type and severity. By integrating physical mechanism analysis with data-driven modeling, this approach provides a feasible pathway for condition monitoring and refined operation and maintenance of cable outer sheaths.
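
The multi-scale band features can be illustrated with a small wavelet packet sketch. The Haar wavelet, the two-level depth, and the toy signal are assumptions chosen to keep the example self-contained; the abstract does not name the wavelet or depth used, and the bands below are in natural (not frequency) order.

```python
import math

def haar_step(x):
    """One orthonormal Haar split into approximation and detail coefficients."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail

def wavelet_packet_energies(signal, levels):
    """Energy of each of the 2**levels packet bands.

    Unlike a plain wavelet transform, every node (approximation and detail) is split
    again at each level. len(signal) must be divisible by 2**levels.
    """
    nodes = [list(signal)]
    for _ in range(levels):
        nodes = [band for node in nodes for band in haar_step(node)]
    return [sum(v * v for v in node) for node in nodes]

# Hypothetical leakage-current samples.
signal = [1.0, 1.0, 1.0, 1.0, 2.0, 0.0, 2.0, 0.0]
energies = wavelet_packet_energies(signal, levels=2)
```

Because the transform is orthonormal, the band energies sum to the energy of the original signal, so they can be normalized into a feature vector per measurement.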

17 pages, 2940 KB

Open Access Article

Integrated Energy Short-Term Adaptive Load Forecasting Method Based on Coupled Feature Extraction

by Yidan Qin, Bonan Huang, Luyuan Wang, Jiaqi Tian and Yameng Zhang

Information 2025, 16(11), 940; https://doi.org/10.3390/info16110940 - 29 Oct 2025

Viewed by 144

Abstract

Integrated energy load forecasting plays a crucial role in optimizing the operation and economic dispatch of integrated energy systems. Its forecasting accuracy is not only time-dependent but also influenced by the coupling characteristics among energy sources. Relying solely on time-scale training methods cannot adequately capture the strong correlations among multiple energy sources. To address challenges in extracting coupled load forecasting features, obtaining periodic characteristics, and setting model network structures, this paper proposes an Integrated Energy Short-Term Adaptive Load Forecasting Method Based on Coupled Feature Extraction (AP-CFE). This approach integrates high-dimensional coupling features and periodic temporal features effectively using ensemble algorithms. To prevent overfitting or underfitting issues, an adaptive learning algorithm (AP) is introduced. The load demonstrates highly stochastic behavior in response to external factors, resulting in rapid, volatile fluctuations in grid demand. The strategy of employing sparse self-attention to approximate the residual terms effectively mitigates this issue. Simulation results using comprehensive energy load data from Australia demonstrate that the proposed model outperforms existing models, achieving better capture of energy coupling characteristics with mean absolute percentage errors reduced by 20.75%, 28.48%, and 21.64% for electricity, heat, and gas loads, respectively.
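
For readers unfamiliar with the error metric, a minimal sketch of the mean absolute percentage error and of a relative reduction between two models follows. The load values and MAPE figures are hypothetical, and reading the reported reductions as relative (rather than percentage-point) changes is an assumption.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Actual values must be non-zero."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)

def relative_reduction(baseline_error, model_error):
    """Percentage by which model_error is lower than baseline_error."""
    return 100.0 * (baseline_error - model_error) / baseline_error

# Hypothetical hourly loads and forecasts.
error = mape(actual=[100.0, 200.0], predicted=[110.0, 180.0])
gain = relative_reduction(baseline_error=5.0, model_error=4.0)
```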

(This article belongs to the Section Artificial Intelligence)

18 pages, 3124 KB

Open Access Article

Frequency-Mode Study of Piezoelectric Devices for Non-Invasive Optical Activation

by Armando Josué Piña-Díaz, Leonardo Castillo-Tobar, Donatila Milachay-Montero, Emigdio Chavez-Angel, Roberto Villarroel and José Antonio García-Merino

Nanomaterials 2025, 15(21), 1650; https://doi.org/10.3390/nano15211650 - 29 Oct 2025

Viewed by 292

Abstract

Piezoelectric materials are fundamental elements in modern science and technology due to their unique ability to convert mechanical and electrical energy bidirectionally. They are widely employed in sensors, actuators, and energy-harvesting systems. In this work, we investigate the behavior of commercial lead zirconate titanate (PZT) sensors under frequency-mode excitation using a combined approach of impedance spectroscopy and optical interferometry. The impedance spectra reveal distinct resonance–antiresonance features that strongly depend on geometry, while interferometric measurements capture dynamic strain fields through fringe displacement analysis. The strongest deformation occurs near the first kilohertz resonance, directly correlated with the impedance phase, enabling the extraction of an effective piezoelectric constant (~40 pC/N). Moving beyond the linear regime, laser-induced excitation demonstrates optically driven activation of piezoelectric modes, with a frequency-dependent response and nonlinear scaling with optical power, characteristic of coupled pyroelectric–piezoelectric effects. These findings introduce a frequency-mode approach that combines impedance spectroscopy and optical interferometry to simultaneously probe electrical and mechanical responses in a single setup, enabling non-contact, frequency-selective sensing without surface modification or complex optical alignment. Although focused on macroscale ceramic PZTs, the non-contact measurement and activation strategies presented here offer scalable tools for informing the design and analysis of piezoelectric behavior in micro- and nanoscale systems. Such frequency-resolved, optical-access approaches are particularly valuable in the development of next-generation nanosensors, MEMS/NEMS devices, and optoelectronic interfaces where direct electrical probing is challenging or invasive.

(This article belongs to the Special Issue Thermal, Electrical and Thermoelectric Properties of Nanomaterials and Their Applications)

24 pages, 2761 KB

Open Access Article

An Explainable AI Framework for Corneal Imaging Interpretation and Refractive Surgery Decision Support

by Mini Han Wang

Bioengineering 2025, 12(11), 1174; https://doi.org/10.3390/bioengineering12111174 - 28 Oct 2025

Viewed by 460

Abstract

This study introduces an explainable neuro-symbolic and large language model (LLM)-driven framework for intelligent interpretation of corneal topography and precision surgical decision support. In a prospective cohort of 20 eyes, comprehensive IOLMaster 700 reports were analyzed through a four-stage pipeline: (1) automated extraction of key parameters—including corneal curvature, pachymetry, and axial biometry; (2) mapping of these quantitative features onto a curated corneal disease and refractive-surgery knowledge graph; (3) Bayesian probabilistic inference to evaluate early keratoconus and surgical eligibility; and (4) explainable multi-model LLM reporting, employing DeepSeek and GPT-4.0, to generate bilingual physician- and patient-facing narratives. By transforming complex imaging data into transparent reasoning chains, the pipeline delivered case-level outputs within ~95 ± 12 s. When benchmarked against independent evaluations by two senior corneal specialists, the framework achieved 92 ± 4% sensitivity, 94 ± 5% specificity, 93 ± 4% accuracy, and an AUC of 0.95 ± 0.03 for early keratoconus detection, alongside an F1 score of 0.90 ± 0.04 for refractive surgery eligibility. The generated bilingual reports were rated ≥4.8/5 for logical clarity, clinical usefulness, and comprehensibility, with representative cases fully concordant with expert judgment. Comparative benchmarking against baseline CNN and ViT models demonstrated superior diagnostic accuracy (AUC = 0.95 ± 0.03 vs. 0.88 and 0.90, p < 0.05), confirming the added value of the neuro-symbolic reasoning layer. All analyses were executed on a workstation equipped with an NVIDIA RTX 4090 GPU and implemented in Python 3.10/PyTorch 2.2.1 for full reproducibility. 
By explicitly coupling symbolic medical knowledge with advanced language models and embedding explainable artificial intelligence (XAI) principles throughout data processing, reasoning, and reporting, this framework provides a transparent, rapid, and clinically actionable AI solution. The approach holds significant promise for improving early ectatic disease detection and supporting individualized refractive surgery planning in routine ophthalmic practice.
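
The diagnostic metrics reported above are all derived from a confusion matrix, as this minimal sketch shows. The counts are hypothetical and were chosen only to make the arithmetic easy to follow; they are not the study's data.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # true positive rate (recall)
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
    }

# Hypothetical counts for a 20-case evaluation.
metrics = diagnostic_metrics(tp=9, fp=1, tn=9, fn=1)
```

With a cohort as small as 20 eyes, a single reclassified case shifts sensitivity or specificity by several percentage points, which is worth keeping in mind when reading the reported intervals.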

(This article belongs to the Special Issue Bioengineering and the Eye—3rd Edition)

25 pages, 2392 KB

Open Access Article

Causal Intervention and Counterfactual Reasoning for Multimodal Pedestrian Trajectory Prediction

by Xinyu Han and Huosheng Xu

J. Imaging 2025, 11(11), 379; https://doi.org/10.3390/jimaging11110379 - 28 Oct 2025

Viewed by 197

Abstract

Pedestrian trajectory prediction is crucial for autonomous systems navigating human-populated environments. However, existing methods face fundamental challenges including spurious correlations induced by confounding social environments, passive uncertainty modeling that limits prediction diversity, and bias coupling during feature interaction that contaminates trajectory representations. To address these issues, we propose a novel Causal Intervention and Counterfactual Reasoning (CICR) framework that shifts trajectory prediction from associative learning to a causal inference paradigm. Our approach features a hierarchical architecture with three core components: a Multisource Encoder that extracts comprehensive spatio-temporal and social context features; a Causal Intervention Fusion Module that eliminates confounding bias through the front-door criterion and cross-attention mechanisms; and a Counterfactual Reasoning Decoder that proactively generates diverse future trajectories by simulating hypothetical scenarios. Extensive experiments on the ETH/UCY, SDD, and AVD datasets demonstrate superior performance, achieving an average ADE/FDE of 0.17/0.24 on ETH/UCY and 7.13/10.29 on SDD, with particular advantages in long-term prediction and cross-domain generalization.
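
ADE and FDE, the two metrics quoted above, can be computed as in this minimal sketch for a single predicted trajectory. The coordinates are hypothetical; multimodal benchmarks commonly report the best of several sampled trajectories, which is not shown here.

```python
import math

def ade_fde(predicted, ground_truth):
    """Average and Final Displacement Error between two equal-length 2-D trajectories."""
    distances = [math.dist(p, g) for p, g in zip(predicted, ground_truth)]
    ade = sum(distances) / len(distances)  # mean error over all future time steps
    fde = distances[-1]                    # error at the final time step only
    return ade, fde

# Hypothetical three-step prediction that drifts away from the true path.
predicted = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
ground_truth = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade, fde = ade_fde(predicted, ground_truth)
```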

(This article belongs to the Special Issue Advances in Machine Learning for Computer Vision Applications)

25 pages, 18310 KB

Open Access Article

A Multimodal Fusion Method for Weld Seam Extraction Under Arc Light and Fume Interference

by Lei Cai and Han Zhao

J. Manuf. Mater. Process. 2025, 9(11), 350; https://doi.org/10.3390/jmmp9110350 - 26 Oct 2025

Viewed by 247

Abstract

During the Gas Metal Arc Welding (GMAW) process, intense arc light and dense fumes cause local overexposure in RGB images and data loss in point clouds, which severely compromises the extraction accuracy of circular closed-curve weld seams. To address this challenge, this paper proposes a multimodal fusion method for weld seam extraction under arc light and fume interference. The method begins by constructing a weld seam edge feature extraction (WSEF) module based on a synergistic fusion network, which achieves precise localization of the weld contour by coupling image arc light-removal and semantic segmentation tasks. Subsequently, an image-to-point cloud mapping-guided Local Point Cloud Feature extraction (LPCF) module is designed, incorporating the Shuffle Attention mechanism to enhance robustness against noise and occlusion. Building upon this, a cross-modal attention-driven multimodal feature fusion (MFF) module integrates 2D edge features with 3D structural information to generate a spatially consistent and detail-rich fused point cloud. Finally, a hierarchical trajectory reconstruction and smoothing method is employed to achieve high-precision reconstruction of the closed weld seam path. The experimental results demonstrate that under severe arc light and fume interference, the proposed method achieves a Root Mean Square Error below 0.6 mm, a maximum error not exceeding 1.2 mm, and a processing time under 5 s. Its performance significantly surpasses that of existing methods, showcasing excellent accuracy and robustness.
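
The two accuracy figures quoted above can be computed as in the following sketch, which assumes the extracted seam points are already paired one-to-one with reference points. That pairing, and the coordinates (in mm), are assumptions; the abstract does not describe how correspondences are established.

```python
import math

def seam_errors(extracted, reference):
    """RMSE and maximum of the point-wise 3-D distances between paired seam points."""
    distances = [math.dist(e, r) for e, r in zip(extracted, reference)]
    rmse = math.sqrt(sum(d * d for d in distances) / len(distances))
    return rmse, max(distances)

# Hypothetical paired points in millimetres.
extracted = [(0.0, 0.0, 0.3), (1.0, 0.0, 0.0), (0.0, 1.0, 0.4)]
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
rmse, max_error = seam_errors(extracted, reference)
```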

21 pages, 3381 KB

Open AccessArticle

Aero-Engine Ablation Defect Detection with Improved CLR-YOLOv11 Algorithm

by Yi Liu, Jiatian Liu, Yaxi Xu, Qiang Fu, Jide Qian and Xin Wang

Sensors 2025, 25(21), 6574; https://doi.org/10.3390/s25216574 - 25 Oct 2025

Viewed by 521

Abstract

Aero-engine ablation detection is a critical task in aircraft health management, yet existing rotation-based object detection methods often face challenges of high computational complexity and insufficient local feature extraction. This paper proposes an improved YOLOv11 algorithm incorporating Context-guided Large-kernel attention and a Rotated detection head, called CLR-YOLOv11. The model achieves synergistic improvement in both detection efficiency and accuracy through dual structural optimization, with its innovations primarily embodied in the following three tightly coupled strategies:
(1) Targeted Data Preprocessing Pipeline Design: To address challenges such as limited sample size, low overall image brightness, and noise interference, we designed an ordered data augmentation and normalization pipeline. This pipeline is not a mere stacking of techniques but strategically enhances sample diversity through geometric transformations (random flipping, rotation), hybrid augmentations (Mixup, Mosaic), and pixel-value transformations (histogram equalization, Gaussian filtering). All processed images subsequently undergo Z-Score normalization. This order-aware pipeline design effectively improves the quality, diversity, and consistency of the input data.
(2) Context-Guided Feature Fusion Mechanism: To overcome the limitations of traditional Convolutional Neural Networks in modeling long-range contextual dependencies between ablation areas and surrounding structures, we replaced the original C3k2 layer with the C3K2CG module. This module adaptively fuses local textural details with global semantic information through a context-guided mechanism, enabling the model to more accurately understand the gradual boundaries and spatial context of ablation regions.
(3) Efficiency-Oriented Large-Kernel Attention Optimization: To expand the receptive field while strictly controlling the additional computational overhead introduced by rotated detection, we replaced the C2PSA module with the C2PSLA module. By employing large-kernel decomposition and a spatial selective focusing strategy, this module significantly reduces computational load while maintaining multi-scale feature perception capability, ensuring the model meets the demands of high real-time applications.
Experiments on a self-built aero-engine ablation dataset demonstrate that the improved model achieves 78.5% mAP@0.5:0.95, representing a 4.2% improvement over the YOLOv11-obb model without the specialized data augmentation. This study provides an effective solution for high-precision real-time aviation inspection tasks.
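Two steps of the preprocessing pipeline in strategy (1), Mixup followed by Z-Score normalization, can be sketched as follows. This is a minimal illustration on flat pixel lists, not the authors' implementation. The function names, the Beta parameter 0.4, and the toy images are assumptions made for the example.

```python
import random
import statistics


def mixup(pixels_a, pixels_b, lam):
    """Blend two equally sized images (flat pixel lists) with weight lam in [0, 1]."""
    if len(pixels_a) != len(pixels_b):
        raise ValueError("images must have the same number of pixels")
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return [lam * a + (1.0 - lam) * b for a, b in zip(pixels_a, pixels_b)]


def z_score(pixels, eps=1e-8):
    """Shift to zero mean and scale to approximately unit standard deviation."""
    mean = statistics.fmean(pixels)
    std = statistics.pstdev(pixels)
    return [(p - mean) / (std + eps) for p in pixels]  # eps guards constant images


# Hypothetical 2-pixel "images"; real inputs would be full image arrays.
image_a = [0.0, 100.0]
image_b = [100.0, 0.0]

# Mixup commonly draws lam from a Beta distribution; alpha = 0.4 is an assumed value.
sampled_lam = random.betavariate(0.4, 0.4)
random_mix = mixup(image_a, image_b, sampled_lam)

# A fixed lam keeps the demonstration deterministic.
mixed = mixup(image_a, image_b, lam=0.25)

# Normalization runs last, matching the order stated in the abstract.
normalized = z_score(mixed)
```

Placing normalization after the augmentations means every image reaching the network has comparable statistics regardless of which augmentations were applied to it.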

(This article belongs to the Special Issue Advanced Neural Architectures for Anomaly Detection in Sensory Data)

21 pages, 5551 KB

Open AccessArticle

Magnetically Coupled Free Piston Stirling Generator for Low Temperature Thermal Energy Extraction Using Ocean as Heat Sink

by Hao Tian, Zezhong Gao and Yongjun Gong

J. Mar. Sci. Eng. 2025, 13(11), 2046; https://doi.org/10.3390/jmse13112046 - 25 Oct 2025

Viewed by 256

Abstract

The ocean, as one of the largest thermal energy storage bodies on earth, has great potential as a thermal-electric energy reserve. By using the relatively fixed-temperature ocean as the heat sink and concentrated solar energy as the heat source, one may construct a mobile power station on the ocean’s surface. However, a traditional solar-based heat source requires a large footprint to concentrate the light beam, resulting in bulky parabolic dishes, which are impractical in ocean engineering scenarios. For buoy-sized applications, the small form factor of the energy collector can only achieve a limited temperature differential, and the resulting energy quality is too low to be usable by traditional spring-loaded free piston Stirling engines. To address these challenges, a low-temperature differential free piston Stirling engine is presented. The engine features a large displacer piston (ϕ136, 5 mm thick) made of corrugated board, and an aluminum power piston (ϕ10). Permanent magnets embedded in both pistons couple them through magnetic attraction rather than a mechanical spring. This magnetic “spring” delivers an inverse-exponential force–distance relation: weak attraction at large separations minimizes damping, while strong attraction at small separations efficiently transfers kinetic energy from the displacer to the power piston. Engine dynamics are captured by a lumped-parameter model implemented in Simulink, with key magnetic parameters extracted from finite-element analysis. Initial results show that the laboratory prototype can operate continuously across heater-to-cooler temperature differences of 58–84 K, sustaining flywheel speeds of 258–324 RPM.
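The qualitative behaviour of the magnetic “spring” can be illustrated with a short sketch. It assumes one possible reading of "inverse-exponential", namely a force that decays as exp(-d/d0) with separation d. The abstract does not give the functional form or any parameter values, so `f0_newton` and `decay_mm` below are purely hypothetical.

```python
import math


def magnetic_force(separation_mm, f0_newton=2.0, decay_mm=5.0):
    """Attractive force under an assumed exponential decay with separation.

    f0_newton (force at zero separation) and decay_mm (decay length) are
    illustrative placeholders, not values from the paper.
    """
    if separation_mm < 0:
        raise ValueError("separation must be non-negative")
    return f0_newton * math.exp(-separation_mm / decay_mm)


near = magnetic_force(1.0)   # small gap: strong coupling between the pistons
far = magnetic_force(20.0)   # large gap: attraction is almost negligible

print(f"force at 1 mm:  {near:.3f} N")
print(f"force at 20 mm: {far:.3f} N")
```

Under this assumed form the force falls off rapidly with distance, which matches the description of strong coupling at small separations and little drag on the pistons at large ones.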

(This article belongs to the Section Marine Energy)
