Search Results (163)

Search Parameters:
Keywords = temporal segmentation transformer

24 pages, 2148 KB  
Article
Infrared Moving Maritime Vessel Segmentation Based on Multi-Scale Spatial–Temporal Transformer Network
by Wenhui Liu, Yulong Qiao, Yue Zhao and Zhengyi Xing
Remote Sens. 2026, 18(7), 1006; https://doi.org/10.3390/rs18071006 - 27 Mar 2026
Viewed by 144
Abstract
Infrared moving maritime vessel segmentation is a crucial image processing task for maritime security, but it is challenging due to complex backgrounds and targets of varying sizes. To address these issues, we propose an end-to-end segmentation network based on a multi-scale spatiotemporal vision transformer (ST-VT) for segmenting moving maritime vessels in infrared image sequences. Specifically, in the feature extraction module, we introduce a multi-scale feature encoding structure that combines a multi-scale backbone with Feature Pyramid Network technology. Then, a multi-scale deformable encoder and a cross-scale fusion module with the pixel decoder are proposed to generate multi-scale spatiotemporal features. Subsequently, we employ improved attention blocks, the core blocks of the prompt decoder's coarse-to-fine (across-scale) framework, to obtain the prompts. Finally, a multi-scale mask decoder produces the final target segmentation. Experiments are conducted on the benchmark dataset IPATCH and our labeled dataset LAS-MassMIND. The results demonstrate that the proposed method achieves state-of-the-art performance, especially on complex backgrounds and targets of varying sizes.

12 pages, 873 KB  
Article
Anatomy-Specific Association of Circulating Sortilin with Proximal Left Anterior Descending Artery Obstruction
by Alim Namitokov, Irina Gilevich, Olga Malyarevskaya, Natalia Iraklionova, Karina Karabakhtsieva and Dana Namitokova
Cardiovasc. Med. 2026, 29(2), 13; https://doi.org/10.3390/cardiovascmed29020013 - 25 Mar 2026
Viewed by 157
Abstract
Background: Sortilin (SORT1), linked to the 1p13.3 coronary risk locus, is implicated in lipid trafficking and atherogenesis; however, clinical studies of circulating SORT1 have produced inconsistent results. We evaluated whether circulating SORT1 is associated with angiographic burden and lesion localization in patients with premature or early clinical debut coronary atherosclerosis. Methods: This single-center, cross-sectional study analyzed a dataset collected from January to May 2023. Participants were classified as coronary atherosclerosis cases if the dataset contained an age of clinical debut of clinically significant atherosclerosis (n = 101). Controls had no recorded debut age and 0% stenosis in all assessed coronary segments (n = 27). Blood was collected in clot activator tubes; serum was stored at −40 °C until analysis. SORT1 (ng/mL) was measured using an Aviscera Bioscience ELISA. Coronary stenoses were recorded as percent diameter stenosis for left main (LM), proximal/mid/distal LAD, proximal/mid/distal LCx, and proximal/mid/distal RCA. Burden metrics included the number of segments with any stenosis (>0%), the number of obstructive segments (≥50%), the number of diseased vessels, and maximum stenosis. The prespecified primary endpoint was obstructive proximal LAD stenosis (≥50%). Nonparametric tests and Spearman correlations were used. Logistic regression evaluated the association between log2-transformed SORT1 and proximal LAD obstruction, adjusted for age, sex, LDL-C, statin use, and smoking/diabetes/hypertension durations. Results: SORT1 was higher in cases than controls (8.60 [2.60–17.10] vs. 2.30 [1.25–10.65] ng/mL; p = 0.0058). Within cases, SORT1 did not correlate with global angiographic burden (any-stenosis segments: ρ = −0.066, p = 0.513; obstructive segments: ρ = −0.060, p = 0.552; diseased vessels: ρ = −0.045, p = 0.652; maximum stenosis: ρ = −0.084, p = 0.403). Obstructive proximal LAD stenosis occurred in 44/101 (43.6%) and was associated with higher SORT1 (12.25 [4.18–17.45] vs. 4.10 [2.20–11.60] ng/mL; p = 0.0093). Each doubling of SORT1 was independently associated with proximal LAD obstruction (adjusted OR 1.48, 95% CI 1.12–1.95; p = 0.005). Conclusions: In this cross-sectional cohort, circulating SORT1 was associated with obstructive proximal LAD stenosis but not with global angiographic burden metrics. These findings are hypothesis-generating and warrant validation in independent cohorts with standardized preanalytics and prospective designs to assess temporal relationships and clinical utility.
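A note on interpreting the reported effect size: because the regression uses log2-transformed SORT1, the coefficient translates into an odds ratio per doubling of the biomarker. A minimal sketch (function name illustrative, not from the paper):

```python
import math

# With a log2-transformed predictor, each doubling of the biomarker
# multiplies the odds of the outcome by exp(beta), where beta is the
# fitted logistic-regression coefficient.
def odds_ratio_per_doubling(beta: float) -> float:
    return math.exp(beta)

# The adjusted OR of 1.48 reported here corresponds to beta = ln(1.48).
beta = math.log(1.48)
print(round(odds_ratio_per_doubling(beta), 2))  # 1.48
```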

25 pages, 913 KB  
Article
Multi-Scale Spatiotemporal Fusion and Steady-State Memory-Driven Load Forecasting for Integrated Energy Systems
by Yong Liang, Lin Bao, Xiaoyan Sun and Junping Tang
Information 2026, 17(3), 309; https://doi.org/10.3390/info17030309 - 23 Mar 2026
Viewed by 219
Abstract
Load forecasting for Integrated Energy Systems (IESs) is critical to enabling multi-energy coordinated optimization and low-carbon scheduling. Facing multiple load types and multi-site high-dimensional heterogeneous data, a global learning challenge remains, stemming from insufficient representation of spatiotemporal coupling features. In response to the multi-source heterogeneous characteristics of IES loads, this paper designs a Spatiotemporal Topology Encoder that maps load data into a tensorized multi-energy spatiotemporal topological representation via fuzzy classification and multi-scale ranking. In parallel, we construct a MultiScale Hybrid Convolver to extract multi-scale, multi-level global spatiotemporal features of multi-energy load representations. We further develop a Temporal Segmentation Transformer and a Steady-State Exponentially Gated Memory Unit, and design a jointly optimized forecasting model that enforces global dynamic correlations and local steady-state preservation. Altogether, we propose a multi-scale spatiotemporal fusion and steady-state memory-driven load forecasting method for integrated energy systems (MSTF-SMDN). Extensive experiments on a public real-world dataset from Arizona State University demonstrate the superiority of the proposed approach: compared to the strongest baseline, MSTF-SMDN reduces cooling load RMSE by 16.09%, heating load RMSE by 12.97%, and electric load RMSE by 6.14%, while achieving R² values of 0.99435, 0.98701, and 0.96722, respectively, confirming its feasibility, efficiency, and promising potential for multi-energy load forecasting in IES.

22 pages, 2677 KB  
Article
A Hybrid Interval Prediction Framework for Photovoltaic Power Prediction Using BiLSTM–Transformer and Adaptive Kernel Density Estimation
by Laiyuan Li and Zhibin Li
Appl. Sci. 2026, 16(6), 3023; https://doi.org/10.3390/app16063023 - 20 Mar 2026
Viewed by 191
Abstract
Photovoltaic (PV) power forecasting is strongly influenced by volatility, randomness, and changing meteorological conditions, while conventional point forecasting provides limited uncertainty information for engineering use. This study proposes a hybrid interval forecasting framework for PV prediction. Similar-day clustering first segments weather data into distinct scenarios (sunny, cloudy, and overcast) to reduce noise and redundant information within sequences, enhancing stability and thereby providing a more refined feature space for deep learning. A BiLSTM–Transformer model is then used as the core forecaster, taking multiple meteorological variables as multi-feature time-series inputs. BiLSTM captures bidirectional temporal dependencies, and the Transformer enhances long-range feature extraction via attention. To improve robustness and stability, the Alpha Evolution (AE) algorithm is applied for hyperparameter optimization, balancing global exploration and local refinement. For probabilistic forecasting, Adaptive Bandwidth Kernel Density Estimation (ABKDE) is employed to construct prediction intervals, where the local bandwidth is determined by minimizing a local error function to adapt to data density and error distribution. Case studies utilizing a full-year, 5 min high-resolution dataset from the DKASC station demonstrate that the proposed AE-BiLSTM–Transformer achieves highly accurate point forecasts across diverse weather conditions, reducing the RMSE by 81.85%, 76.99%, and 72.26% under sunny, cloudy, and overcast scenarios, respectively, compared to the baseline LSTM. ABKDE further produces reliable and compact intervals; at the 90% confidence level on sunny days, it achieves PICP = 0.921 with PINAW = 0.0378, reducing PINAW by 75.16% relative to conventional KDE while maintaining comparable coverage.
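To illustrate the general idea behind adaptive-bandwidth KDE (this sketch is not the paper's ABKDE, whose bandwidths are optimized against a local error function; here each sample's Gaussian bandwidth is simply scaled by its neighbour spacing):

```python
import math

# Toy adaptive-bandwidth KDE over forecast errors: samples in dense
# regions get narrow kernels, samples in sparse regions get wide ones.
def adaptive_kde(errors, h_base=0.5):
    srt = sorted(errors)
    hs = []
    for i in range(len(srt)):
        left = srt[i - 1] if i > 0 else srt[i]
        right = srt[i + 1] if i < len(srt) - 1 else srt[i]
        local = max((right - left) / 2, 1e-6)  # half-spacing to neighbours
        hs.append(h_base * local)

    def pdf(x):
        total = 0.0
        for e, h in zip(srt, hs):
            total += math.exp(-0.5 * ((x - e) / h) ** 2) / (h * math.sqrt(2 * math.pi))
        return total / len(srt)

    return pdf

pdf = adaptive_kde([0.0, 0.1, 0.2, 1.0])
# Density is highest near the cluster of small errors.
```

A prediction interval then follows by integrating this density over the error axis until the desired coverage (e.g., 90%) is reached.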

16 pages, 1317 KB  
Article
Digital Gait Biomarkers for Parkinson’s Disease: Subject-Wise Validated Explainable AI Framework Using Vertical Ground Reaction Force Signals
by Moonhyeok Choi, Jaehyun Jo and Jinhyoung Jeong
Bioengineering 2026, 13(3), 360; https://doi.org/10.3390/bioengineering13030360 - 19 Mar 2026
Viewed by 433
Abstract
Parkinson’s disease (PD) is associated with progressive gait deterioration; however, widely used clinical scales such as the Hoehn & Yahr (H&Y) stage are limited in capturing continuous severity changes due to subjectivity and discrete grading. This study proposes a two-stage explainable AI framework using vertical ground reaction force (VGRF) signals to achieve reproducible PD detection and continuous severity estimation. In the first stage, three deep learning models, temporal convolutional network (TCN), BiGRU with attention, and FCNN-Transformer, were trained using windowed VGRF signals under repeated subject-wise data segmentation. All models achieved high discrimination performance (AUC ≥ 0.93), with FCNN-Transformer showing the highest mean AUC (0.940) and statistically superior performance (paired Wilcoxon test, p < 0.05). Stability-based explainable AI using Integrated Gradients consistently identified variability-related VGRF features as the most informative, and these were also significantly different between groups at the data level (p < 0.001, FDR-corrected). In the second stage, XGBoost regression was applied to PD subjects to predict continuous H&Y severity, achieving strong correlation with clinical grades (Spearman ρ = 0.921, p < 0.001), low error (MAE = 0.158, RMSE = 0.241), and high determination (R² = 0.953). This shows that gait-based features are sufficiently sensitive to continuously quantify disease progression. In addition, in the TREND prospective longitudinal cohort (n = 696), wearable walking indicators differed significantly from those of non-patients prior to diagnosis, and a decline in walking pace was observed approximately four years before Parkinson’s disease diagnosis, providing a basis for early screening and monitoring using gait-based digital biomarkers. These results demonstrate that gait-based digital biomarkers can objectively quantify both PD presence and disease progression. The proposed framework provides a reproducible, explainable, and clinically interpretable AI-based decision support approach for PD assessment.

21 pages, 1669 KB  
Article
Robust BEV Perception via Dual 4D Radar–Camera Fusion Under Adverse Conditions with Fog-Aware Enhancement
by Zhengqing Li and Baljit Singh
Electronics 2026, 15(6), 1284; https://doi.org/10.3390/electronics15061284 - 19 Mar 2026
Viewed by 244
Abstract
Bird’s-eye-view (BEV) perception has emerged as a key representation for unified scene understanding in autonomous driving. However, current BEV methods relying solely on monocular cameras suffer from severe degradation under adverse weather and dynamic scenes due to limited depth cues and illumination dependency. To address these challenges, we propose a robust multi-modal BEV perception framework that integrates dual-source 4D millimeter-wave radar and multi-view camera images. The proposed architecture systematically exploits Doppler velocity and temporal information from 4D radar to model dynamic object motion, while introducing a deformable fusion strategy in the BEV space for accurate semantic alignment across modalities. Our design includes four key modules: a Doppler-Aware Radar Encoder (DARE) that enhances motion-sensitive features via velocity-guided attention; a Fog-Aware Feature Denoising Module (FADM) that suppresses modality inconsistency in low-visibility conditions through cross-modal attention and residual enhancement; a Multi-Modal Temporal Fusion Module (TFM) that encodes radar temporal sequences using a Transformer encoder for motion continuity modeling; and a confidence-aware multi-task loss that jointly supervises semantic segmentation, motion estimation, and object detection. Extensive experiments on the DualRadar dataset and adverse-weather simulations demonstrate that our method achieves significant gains over state-of-the-art baselines in BEV segmentation accuracy, detection robustness, and motion stability. The proposed framework offers a scalable and resilient solution for real-world autonomous perception, especially under challenging environmental conditions.
(This article belongs to the Special Issue Image Processing Based on Convolution Neural Network: 2nd Edition)

26 pages, 3519 KB  
Article
Subject-Independent Depression Recognition from EEG Using an Improved Bidirectional LSTM with Dynamic Vector Routing
by Ziqi Ji, Kunye Liu, Weikai Ma, Xiaolin Ning and Yang Gao
Bioengineering 2026, 13(3), 358; https://doi.org/10.3390/bioengineering13030358 - 19 Mar 2026
Viewed by 445
Abstract
Electroencephalography (EEG) has become an increasingly important tool in depression research due to its ability to capture objective neurophysiological abnormalities associated with depressive disorders, offering high temporal resolution, non-invasiveness, and cost-effectiveness. However, existing methods often fail to fully exploit the multi-domain information in EEG signals, resulting in limited model generalization capabilities. This paper proposes an improved bidirectional long short-term memory (BiLSTM) model that segments continuous EEG into non-overlapping 2-s epochs and learns end-to-end from multi-channel temporal sequences. After band-pass filtering and resampling, each epoch is represented as a channel–time matrix X ∈ ℝ^{C×T} (with C = 128) and processed by a BiLSTM encoder followed by a dynamic-routing encapsulated-vector classifier. On the MODMA dataset under subject-independent five-fold cross-validation, the proposed method outperforms a set of reproduced representative baselines (SVM, EEGNet, InceptionNet, Self-attention-CNN, and CNN–LSTM) and achieves 84.8% accuracy with an AUC of 0.899. We further discuss recent contemporary directions (e.g., attention/Transformer-based and emotion-aware expert models) and clarify the scope of our empirical comparisons. Furthermore, experiments comparing different frequency bands and band combinations indicate that joint multi-frequency input can enhance classification performance. This study provides an effective multi-domain fusion approach for the automatic diagnosis of depression based on EEG.
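The epoching step described above (continuous multi-channel EEG sliced into non-overlapping fixed-length channel-by-time matrices) can be sketched as follows; function and variable names are illustrative, not from the paper:

```python
# Split a continuous multi-channel recording into non-overlapping
# epochs of epoch_sec seconds, each a (C, T) channel-by-time matrix.
def segment_epochs(eeg, fs, epoch_sec=2.0):
    """eeg: list of C channel sample lists; fs: sampling rate in Hz."""
    t = int(fs * epoch_sec)          # samples per epoch (T)
    n = len(eeg[0]) // t             # number of complete epochs
    return [[ch[i * t:(i + 1) * t] for ch in eeg] for i in range(n)]

# 2 channels, 5 s at 100 Hz -> two complete 2-s epochs of 200 samples.
eeg = [list(range(500)), list(range(500))]
epochs = segment_epochs(eeg, fs=100)
print(len(epochs), len(epochs[0]), len(epochs[0][0]))  # 2 2 200
```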
(This article belongs to the Section Biosignal Processing)

19 pages, 1198 KB  
Article
GSMTNet: Dual-Stream Video Anomaly Detection via Gated Spatio-Temporal Graph and Multi-Scale Temporal Learning
by Di Jiang, Huicheng Lai, Guxue Gao, Dan Ma and Liejun Wang
Electronics 2026, 15(6), 1200; https://doi.org/10.3390/electronics15061200 - 13 Mar 2026
Viewed by 268
Abstract
Video Anomaly Detection aims to identify video segments containing abnormal events. Detecting anomalies relies heavily on temporal modeling, particularly when anomalies exhibit only subtle deviations from normal events. However, most existing methods inadequately model the heterogeneity in spatiotemporal relationships, especially the dynamic interactions between human pose and video appearance. To address this, we propose GSMTNet, a dual-stream heterogeneous unsupervised network integrating gated spatio-temporal graph convolution and multi-scale temporal learning. First, we introduce a dynamic graph structure learning module, which leverages gated spatio-temporal graph convolutions with manifold transformations to model latent spatial relationships via human pose graphs. This is coupled with a normalizing flow-based density estimation module to model the probability distribution of normal samples in a latent space. Second, we design a hybrid dilated temporal module that employs multi-scale temporal feature learning to simultaneously capture long- and short-term dependencies, thereby enhancing the separability between normal patterns and potential deviations. Finally, we propose a dual-stream fusion module to hierarchically integrate features learned from pose graphs and raw video sequences, followed by a prediction head that computes anomaly scores from the fused features. Extensive experiments demonstrate state-of-the-art performance, achieving 86.81% AUC on ShanghaiTech and 70.43% on UBnormal, outperforming existing methods in rare anomaly scenarios.

81 pages, 28674 KB  
Article
Representation Learning for Maritime Vessel Behaviour: A Three-Stage Pipeline for Robust Trajectory Embeddings
by Ghassan Al-Falouji, Shang Gao, Zhixin Huang, Ben Biesenbach, Peer Kröger, Bernhard Sick and Sven Tomforde
J. Mar. Sci. Eng. 2026, 14(5), 507; https://doi.org/10.3390/jmse14050507 - 8 Mar 2026
Viewed by 255
Abstract
The growing complexity of maritime navigation creates safety challenges that drive the shift toward autonomous systems. Maritime vessel behaviour modelling is critical for safe and efficient autonomous operations. Representation learning offers a systematic approach to learn feature embeddings encoding vessel behaviour for improved situational awareness and decision-making. We introduce a three-stage representation learning pipeline evaluating six architectures on real-world AIS trajectories. Grouped Masked Autoencoder (GMAE)-Risk Extrapolation (REx) combines group-wise masked autoencoding at the semantic feature level with risk extrapolation regularisation, forcing encoders to learn cross-group dependencies between temporal, kinematic, spatial, and interaction features. DAE and EAE provide robust and uncertainty-aware baselines. Evaluation uses a dual-pipeline framework on two years of Kiel Fjord AIS data (176,787 trajectories, 527,225 segments). Pipeline 1 applies three-stage representation learning using vessel-type classification as encoder selection probe. GMAE-REx achieves 86.03% validation accuracy, outperforming DAE (85.63%), EAE (85.56%), and baselines Transformer (84.93%), TCN (76.27%), LiST (85.12%). Pipeline 2 applies unsupervised clustering to discover intrinsic behavioural structure. Learnt representations consistently outperform expert features on DBCV, conductance, and modularity metrics, organising trajectories by operational context rather than vessel type. This behaviour-oriented organisation enables cross-vessel knowledge transfer for autonomous navigation, VTS monitoring, and safety analysis.
(This article belongs to the Special Issue Intelligent Solutions for Marine Operations)

18 pages, 2330 KB  
Article
An Explainable Time-Series Knowledge Graph Framework with Dynamic Temporal Segmentation for Industrial Spindle Health Monitoring
by Chun-Shih Cheng and Guan-Ju Peng
Machines 2026, 14(3), 291; https://doi.org/10.3390/machines14030291 - 4 Mar 2026
Viewed by 255
Abstract
This study presents an explainable knowledge graph (KG) framework that transforms continuous spindle monitoring time-series data into transparent, reasoning-ready diagnostic structures. Existing data-driven approaches, while accurate, often lack the interpretability required for high-stakes industrial decision-making and are sensitive to operating condition drifts. To address these limitations, we propose a two-level temporal segmentation method combining label transition detection and statistical drift analysis to identify meaningful state boundaries. Furthermore, a percentile-based discretization mechanism converts statistical features into interpretable semantic tags. A Neo4j-based state–event–feature schema captures lifecycle evolution and evidence relations, enabling attribution path reasoning that links failure events to salient precursor features. Experiments on real industrial spindle data demonstrate a fault detection accuracy of 84.97% and a false alarm rate of 3.43%, effectively capturing stable baselines and intermittent abnormal bursts. The proposed framework is novel in bridging the gap between numerical time series and symbolic reasoning, offering a practical pathway for deploying explainable and maintainable spindle health analytics.
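The percentile-based discretization idea (mapping a raw statistical feature to an interpretable semantic tag using percentile cut points from a reference window) can be sketched as below; the tag names and quartile cut points are illustrative assumptions, not taken from the paper:

```python
# Map a feature value to a semantic tag via percentile cut points
# computed from a reference sample (here: the 25th/75th percentiles).
def percentile_tag(value, reference):
    ref = sorted(reference)
    p25 = ref[int(0.25 * (len(ref) - 1))]
    p75 = ref[int(0.75 * (len(ref) - 1))]
    if value < p25:
        return "low"
    if value > p75:
        return "high"
    return "normal"

baseline = [1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9]
print(percentile_tag(2.5, baseline))  # high
```

Tags produced this way can then be attached as feature nodes in the state–event–feature graph.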
(This article belongs to the Section Industrial Systems)

22 pages, 340 KB  
Article
From Patient Emotion Recognition to Provider Understanding: A Multimodal Data Mining Framework for Emotion-Aware Clinical Counseling Systems
by Saahithi Mallarapu, Xinyan Liu, Pegah Zargarian, Seyyedeh Fatemeh Mottaghian, Ramyashree Suresha, Vasudha Jain and Akram Bayat
Computers 2026, 15(3), 161; https://doi.org/10.3390/computers15030161 - 3 Mar 2026
Viewed by 352
Abstract
Computational analysis of therapeutic communication presents challenges in multi-label classification, severe class imbalance, and heterogeneous multimodal data integration. We introduce a bidirectional analytical framework addressing patient emotion recognition and provider behavior analysis. For patient-side analysis, we employ ClinicalBERT on human-annotated CounselChat (1482 interactions, 25 categories, imbalance 60:1), achieving a macro-F1 of 0.74 through class weighting and threshold optimization, representing a six-fold improvement over naive baselines and 6–13 point improvement over modern imbalance methods. For provider-side analysis, we process 330 YouTube therapy sessions through automated pipelines (speaker diarization, automatic speech recognition, temporal segmentation), yielding 14,086 annotated segments. Our architecture combines DeBERTa-v3-base with WavLM-base-plus through cross-modal attention mechanisms adapted from multimodal Transformer frameworks. On controlled human-annotated HOPE data (178 sessions, 12,500 utterances), the model achieves a macro-F1 of 0.91 with Cohen’s kappa of 0.87, comparable to inter-rater reliability reported in psychotherapy process research. On YouTube data, a macro-F1 of 0.71 demonstrates feasibility while highlighting annotation quality impacts. Cross-dataset transfer and systematic attention analyses validate domain-specific effectiveness and interpretability.
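Macro-F1, the headline metric throughout this entry, averages per-class F1 scores without frequency weighting, so rare classes count as much as common ones, which is why it suits the 60:1 imbalance described. A minimal sketch:

```python
# Macro-F1: compute F1 per class, then take the unweighted mean.
def macro_f1(y_true, y_pred, classes):
    f1s = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)

print(round(macro_f1(["a", "a", "b"], ["a", "b", "b"], ["a", "b"]), 4))  # 0.6667
```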

24 pages, 1346 KB  
Systematic Review
Artificial Intelligence in Cadastre: A Systematic Review of Methods, Applications, and Trends
by Jingshu Chen, Majid Nazeer, Bo Sum Lee and Man Sing Wong
Land 2026, 15(3), 411; https://doi.org/10.3390/land15030411 - 2 Mar 2026
Viewed by 645
Abstract
Surveying and register administration are core to land administration, and land surveying and registration are therefore essential to socio-economic development, where accuracy and efficiency are critical. To date, customary land surveying and registration have relied on human input, which undermines efficiency and is prone to errors in data handling. During the last decade, the exponential growth of artificial intelligence (AI), in particular geospatial artificial intelligence (GeoAI), has provided new methodologies that can overcome these deficiencies. This review examines AI in cadastral management by analyzing technical solutions and trends across three areas: data collection, modeling, and common applications. It aims to provide a comprehensive survey of the current use of AI in cadastral management and to define future research avenues. Based on a comprehensive review of the literature, this study reaches the following three conclusions. (1) Automated extraction of parcel boundaries has been achieved through deep learning in data collection and processing, removing the bottlenecks of manual interpretation. Models such as convolutional neural networks (CNNs) and Transformers have been used for pixel-level semantic segmentation of high-resolution remote sensing images, leading to significant improvements in efficiency and accuracy. (2) Non-spatial data have been processed with natural language processing techniques to automatically extract information and construct relationships, thus overcoming the limitations of paper-based archives and traditional relational databases. (3) Deep learning models have been applied to automatically detect parcel changes and to enable integrated analysis of spatial and non-spatial data, which has supported the transition of cadastral management from two-dimensional to three-dimensional. However, several challenges remain, including differences in multi-temporal data processing, spatial semantic ambiguity, and the lack of large-scale, high-quality annotated data. Future research can focus on improving model generalization, advancing cross-modal data fusion, and providing recommendations for the development of a reliable and practical intelligent cadastral system.

29 pages, 12396 KB  
Article
Multi-Channel SCADA-Based Image-Driven Power Prediction for Wind Turbines Using Optimized LeNet-5-LSTM Hybrid Neural Architecture
by Muhammad Ahsan and Phong Ba Dao
Energies 2026, 19(5), 1169; https://doi.org/10.3390/en19051169 - 26 Feb 2026
Viewed by 313
Abstract
Accurate power prediction is essential for assessing wind turbine performance under real-world operating conditions and for supporting condition monitoring and maintenance planning using SCADA data. Most existing approaches rely directly on raw SCADA signals, which may limit their ability to capture complex spatiotemporal dependencies among operational variables. To address this limitation, this paper proposes a novel SCADA-driven power prediction framework that transforms selected SCADA variables into multi-channel grayscale images and leverages an optimized LeNet-5–LSTM hybrid neural network for active and reactive power prediction. First, the SCADA dataset is analyzed to identify the most influential variables affecting power output. Six key variables are then selected, segmented, and encoded as 2D grayscale images, enabling the model to learn richer feature representations compared to conventional raw SCADA data-based methods. The proposed network combines convolutional layers for spatial feature extraction from SCADA data-based grayscale images with LSTM layers to capture temporal dependencies. Model training incorporates a customized loss function that integrates both data-driven supervision and physics-based constraints. The model is trained using 70% of the image-based dataset, with five independent runs to ensure robustness and reproducibility, while the remaining 30% is used for testing. The proposed approach is validated using SCADA data from three real-world cases: (i) a 2 MW Siemens wind turbine in Poland, (ii) a Vestas V52 wind turbine in Ireland, and (iii) the La Haute Borne wind farm in France, consisting of four wind turbines. The results demonstrate that the SCADA-based image representation enables the proposed LeNet-5–LSTM model to effectively learn discriminative feature patterns and achieve accurate active and reactive power predictions across different turbine types and operating conditions.
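One common way to turn a time-series segment into a grayscale image, sketched below, is to min-max scale the segment to 0–255 and reshape it into a square pixel grid, one channel per variable. This is an illustrative assumption about the encoding, not the paper's exact procedure:

```python
# Min-max scale a 1-D segment to 0-255 and reshape it into a
# side x side grayscale image (row-major).
def segment_to_grayscale(segment, side):
    lo, hi = min(segment), max(segment)
    span = (hi - lo) or 1.0              # guard against constant segments
    pixels = [int(255 * (v - lo) / span) for v in segment[: side * side]]
    return [pixels[r * side:(r + 1) * side] for r in range(side)]

img = segment_to_grayscale([float(i) for i in range(16)], side=4)
print(len(img), len(img[0]), img[0][0], img[3][3])  # 4 4 0 255
```

Stacking one such image per SCADA variable yields the multi-channel input a convolutional front end like LeNet-5 expects.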
(This article belongs to the Special Issue Machine Learning in Renewable Energy Resource Assessment)
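The variable-to-image encoding step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the window length, image size, and per-window min-max scaling to 8-bit intensities are assumptions.

```python
import numpy as np

def scada_windows_to_images(series, window=64, side=8):
    """Segment SCADA variables into fixed-length windows and encode each
    window as a multi-channel 8-bit grayscale image.

    series : (n_vars, n_samples) array of selected SCADA variables.
    window : samples per window; must equal side * side.
    Returns an array of shape (n_windows, n_vars, side, side), dtype uint8.
    """
    n_vars, n_samples = series.shape
    assert window == side * side, "window must reshape to a square image"
    n_win = n_samples // window
    imgs = np.empty((n_win, n_vars, side, side), dtype=np.uint8)
    for w in range(n_win):
        seg = series[:, w * window:(w + 1) * window]           # (n_vars, window)
        lo = seg.min(axis=1, keepdims=True)
        hi = seg.max(axis=1, keepdims=True)
        # per-variable min-max scaling to [0, 1]; guard constant segments
        scaled = (seg - lo) / np.where(hi > lo, hi - lo, 1.0)
        imgs[w] = (scaled * 255).round().astype(np.uint8).reshape(n_vars, side, side)
    return imgs

# Example: 6 variables, 640 samples -> 10 images of shape (6, 8, 8)
demo = np.random.default_rng(0).normal(size=(6, 640))
images = scada_windows_to_images(demo)
print(images.shape)  # (10, 6, 8, 8)
```

Each resulting multi-channel image can then be fed to convolutional layers, with the window sequence consumed by the LSTM.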

15 pages, 4263 KB  
Article
Driver Attention Prediction Based on Adaptive Fusion of Cross-Modal Features
by Mingfang Zhang, Tong Zhang, Congling Yan and Yiran Zhang
Appl. Sci. 2026, 16(4), 2150; https://doi.org/10.3390/app16042150 - 23 Feb 2026
Viewed by 332
Abstract
To investigate the dynamic changes in driver attention in complex road traffic scenarios, this paper proposes a driver attention prediction method based on cross-modal adaptive feature fusion (DAFNet). First, semantic segmentation is applied to the input image sequences, and a dual-branch encoder built on a 3D residual network extracts spatio-temporal features from the RGB images and the semantic information in parallel. Next, a 3D deformable attention mechanism is introduced to enhance the standard Transformer, focusing on key salient regions through spatio-temporal offset prediction and adaptive fusion of cross-modal features. Subsequently, a predictive recurrent neural network forecasts the fused spatio-temporal features and improves the stability of long-term sequence prediction. Finally, the driver attention results are predicted by a lightweight decoder. Experimental results demonstrate that the proposed method outperforms the comparative methods in overall performance. The predictions not only capture salient regions in driving scenes in a bottom-up manner but also track the driver’s intent in a top-down manner. Thus, our method exhibits strong adaptability to various complex traffic scenarios. Additionally, the method achieves an inference speed of 53.73 frames per second, satisfying the real-time performance requirement of on-vehicle systems. Full article
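As an illustration of the adaptive cross-modal fusion idea (not DAFNet's actual architecture), a gated fusion of RGB and semantic feature volumes might look like the sketch below; deriving the per-channel gate from globally pooled statistics is an assumption.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(rgb_feat, sem_feat, w, b):
    """Fuse two (C, T, H, W) feature volumes with a learned per-channel
    gate computed from globally pooled statistics of both modalities."""
    stacked = np.concatenate([rgb_feat, sem_feat], axis=0)   # (2C, T, H, W)
    pooled = stacked.mean(axis=(1, 2, 3))                    # (2C,) global average pool
    gate = sigmoid(w @ pooled + b)                           # (C,) values in (0, 1)
    g = gate[:, None, None, None]
    # convex combination: the gate decides each channel's modality balance
    return g * rgb_feat + (1.0 - g) * sem_feat

rng = np.random.default_rng(1)
C, T, H, W = 4, 2, 5, 5
rgb = rng.normal(size=(C, T, H, W))
sem = rng.normal(size=(C, T, H, W))
w = rng.normal(size=(C, 2 * C)) * 0.1   # hypothetical learned weights
b = np.zeros(C)
fused = gated_fusion(rgb, sem, w, b)
print(fused.shape)  # (4, 2, 5, 5)
```

Because the gate lies in (0, 1), every fused value stays between the two modality values for that position, which keeps the fusion stable regardless of the learned weights.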

22 pages, 5296 KB  
Article
Pepper-4D: Spatiotemporal 3D Pepper Crop Dataset for Phenotyping
by Foysal Ahmed, Dawei Li, Boyuan Zhao, Zhanjiang Wang, Jiali Huang, Tingzhicheng Li, Jingjing Huang, Jiahui Hou, Sayed Jobaer and Han Yan
Plants 2026, 15(4), 599; https://doi.org/10.3390/plants15040599 - 13 Feb 2026
Viewed by 676
Abstract
Pepper (Capsicum annuum) is a globally significant horticultural crop cultivated for its culinary, medicinal, and economic value. Traditional approaches to boosting pepper production, notably expanding farmland, have become increasingly unsustainable. Recent advances in artificial intelligence and 3D computer vision have begun to transform crop cultivation and phenotyping, shedding new light on increasing production through advanced breeding. However, the field still lacks 3D pepper data with enough detail for organ-level analysis. We therefore propose Pepper-4D, a new high-precision 4D point cloud dataset that records both the spatial structure and the temporal development of pepper plants across continuous growth stages. The dataset is divided into three subsets comprising a total of 916 individual point clouds from 29 indoor-cultivated pepper plant samples. It provides manual annotations at both the plant level and the organ level, supporting phenotyping tasks such as pepper growth status classification, organ semantic segmentation, organ instance segmentation, organ growth tracking, new organ detection, and even the generation of synthetic 3D pepper plants. Full article
(This article belongs to the Special Issue AI-Driven Machine Vision Technologies in Plant Science)
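To illustrate how organ-level annotations of this kind are typically consumed, consider per-point semantic and instance labels; the array layout and class ids below are hypothetical examples, not Pepper-4D's actual file format.

```python
import numpy as np

def organ_summary(sem_labels, inst_labels):
    """Summarize per-point annotations: for each semantic organ class,
    count its distinct instances and its total number of points."""
    summary = {}
    for cls in np.unique(sem_labels):
        mask = sem_labels == cls
        summary[int(cls)] = {
            "instances": int(len(np.unique(inst_labels[mask]))),
            "points": int(mask.sum()),
        }
    return summary

# Toy plant: 5 points, two hypothetical organ classes (0 = leaf, 1 = fruit)
sem = np.array([0, 0, 1, 1, 1])
inst = np.array([0, 1, 2, 2, 3])
print(organ_summary(sem, inst))
# {0: {'instances': 2, 'points': 2}, 1: {'instances': 2, 'points': 3}}
```

Summaries like this support sanity checks on annotation consistency before training segmentation or organ-tracking models.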
