MDPI - Publisher of Open Access Journals

16 pages, 2759 KB

Open Access Article

Machine Learning-Based Position Detection Using Hall-Effect Sensor Arrays on Resource-Constrained Microcontroller

by Zalán Németh, Chan Hwang See, Keng Goh, Arfan Ghani, Simeon Keates and Raed A. Abd-Alhameed

Sensors 2025, 25(20), 6444; https://doi.org/10.3390/s25206444 (registering DOI) - 18 Oct 2025

This paper presents an electromagnetic levitation system that stabilizes a magnetic body using an array of electromagnets controlled by a Hall-effect sensor array and TinyML-based position detection. Departing from conventional optical tracking methods, the proposed design combines finite-element-optimized electromagnets with a microcontroller-optimized neural network that processes sensor data to predict the levitated object’s position with 0.0263–0.0381 mm mean absolute error. The system employs both quantized and full-precision implementations of a supervised multi-output regression model trained on spatially sampled data (40 × 40 × 15 mm volume at 5 mm intervals). Comprehensive benchmarking demonstrates stable operation at 850–1000 Hz control frequencies, matching optical systems’ performance while eliminating their cost and complexity. The integrated solution performs real-time position detection and current calculation entirely on-board, requiring no external tracking devices or high-performance computing. By achieving sub-30 μm accuracy with standard microcontrollers and minimal hardware, this work validates machine learning as a viable alternative to optical position detection in magnetic levitation systems, reducing implementation barriers for research and industrial applications. The complete system design, including electromagnetic array characterization, neural network architecture selection, and real-time implementation challenges, is presented alongside performance comparisons with conventional approaches.

(This article belongs to the Special Issue Magnetic Field Sensing and Measurement Techniques)

23 pages, 2837 KB

Open Access Article

Spatial Error Prediction and Compensation of Industrial Robots Based on Extended Joints and BO-XGBoost

by Bingran Yang and Xuedong Jing

Sensors 2025, 25(20), 6422; https://doi.org/10.3390/s25206422 - 17 Oct 2025

Robotic positioning accuracy is paramount in complex tasks. This accuracy is influenced by both geometric and non-geometric factors, making error prediction a significant challenge. To address this, this paper introduces two key contributions. First, we propose a novel input feature, the robot’s “extended joint angles,” which incorporates joint reversal information to better capture non-geometric errors like gear backlash. Second, we develop a high-accuracy spatial error prediction model by combining the Extreme Gradient Boosting (XGBoost) algorithm with Bayesian Optimization (BO) for hyperparameter tuning. The BO-XGBoost model establishes a direct non-linear mapping from the extended joint angles to the positioning error. Experimental results demonstrate that after compensation, the mean position error was reduced from 1.0751 mm to 0.1008 mm (a 90.62% decrease), the maximum error from 3.3884 mm to 0.4782 mm (an 85.88% decrease), and the standard deviation from 0.5383 mm to 0.0832 mm (an 84.54% decrease). A comparative analysis against Decision Tree, K-Nearest Neighbors, and Random Forest models further validates the superiority of the proposed method in reducing robot position error.

(This article belongs to the Special Issue Advanced Robotic Manipulators and Control Applications)

17 pages, 1005 KB

Open Access Article

Leveraging Clinical Record Geolocation for Improved Alzheimer’s Disease Diagnosis Using DMV Framework

by Peng Zhang and Divya Chaudhary

Biomedicines 2025, 13(10), 2496; https://doi.org/10.3390/biomedicines13102496 - 14 Oct 2025

Background: Early detection of Alzheimer’s disease (AD) is critical for timely intervention, but clinical assessments and neuroimaging are often costly and resource intensive. Natural language processing (NLP) of clinical records offers a scalable alternative, and integrating geolocation may capture complementary environmental risk signals. Methods: We propose the DMV (Data processing, Model training, Validation) framework that frames early AD detection as a regression task predicting a continuous risk score (“data_value”) from clinical text and structured features. We evaluated embeddings from Llama3-70B, GPT-4o (via text-embedding-ada-002), and GPT-5 (text-embedding-3-large) combined with a Random Forest regressor on a CDC-derived dataset (≈284 k records). Models were trained and assessed using 10-fold cross-validation. Performance metrics included Mean Squared Error (MSE), Mean Absolute Error (MAE), and R²; paired t-tests and Wilcoxon signed-rank tests assessed statistical significance. Results: Including geolocation (latitude and longitude) consistently improved performance across models. For the Random Forest baseline, MSE decreased by 48.6% when geolocation was added. Embedding-based models showed larger gains; GPT-5 with geolocation achieved the best results (MSE = 14.0339, MAE = 2.3715, R² = 0.9783), and the reduction in error from adding geolocation was statistically significant (p < 0.001, paired tests). Conclusions: Combining high-quality text embeddings with patient geolocation yields substantial and statistically significant improvements in AD risk estimation. Incorporating spatial context alongside clinical text may help clinicians account for environmental and regional risk factors and improve early detection in scalable, data-driven workflows.

(This article belongs to the Special Issue Neurodevelopmental Disorders: From Pathophysiology to Novel Therapeutic Approaches—2nd Edition)

26 pages, 5440 KB

Open Access Article

Improved Streamflow Forecasting Through SWE-Augmented Spatio-Temporal Graph Neural Networks

by Akhila Akkala, Soukaina Filali Boubrahimi, Shah Muhammad Hamdi, Pouya Hosseinzadeh and Ayman Nassar

Hydrology 2025, 12(10), 268; https://doi.org/10.3390/hydrology12100268 - 11 Oct 2025

Streamflow forecasting in snowmelt-dominated basins is essential for water resource planning, flood mitigation, and ecological sustainability. This study presents a comparative evaluation of statistical, machine learning (Random Forest), and deep learning models (Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), and Spatio-Temporal Graph Neural Network (STGNN)) using 30 years of data from 20 monitoring stations across the Upper Colorado River Basin (UCRB). We assess the impact of integrating meteorological variables, particularly the Snow Water Equivalent (SWE), and spatial dependencies on predictive performance. Among all models, the Spatio-Temporal Graph Neural Network (STGNN) achieved the highest accuracy, with a Nash–Sutcliffe Efficiency (NSE) of 0.84 and Kling–Gupta Efficiency (KGE) of 0.84 in the multivariate setting at the critical downstream node, Lees Ferry. Compared to the univariate setup, SWE-enhanced predictions reduced Root Mean Square Error (RMSE) by 12.8%. Seasonal and spatial analyses showed the greatest improvements at high-elevation and mid-network stations, where snowmelt dynamics dominate runoff. These findings demonstrate that spatio-temporal learning frameworks, especially STGNNs, provide a scalable and physically consistent approach to streamflow forecasting under variable climatic conditions.
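For readers unfamiliar with the two skill scores quoted above, the following is a minimal sketch of how NSE and KGE are commonly computed from paired observed and simulated flows. It assumes the original 2009 KGE formulation (correlation, variability ratio, bias ratio); the abstract does not say which KGE variant the authors used, and the function names and sample values are illustrative only.

```python
import math
from statistics import fmean, pstdev


def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 minus squared error over variance of observations."""
    mean_obs = fmean(obs)
    squared_error = sum((o - s) ** 2 for o, s in zip(obs, sim))
    obs_spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - squared_error / obs_spread


def kge(obs, sim):
    """Kling-Gupta Efficiency (assumed 2009 form) from correlation, variability, and bias."""
    mean_obs, mean_sim = fmean(obs), fmean(sim)
    std_obs, std_sim = pstdev(obs), pstdev(sim)
    covariance = fmean([(o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim)])
    r = covariance / (std_obs * std_sim)   # linear correlation
    alpha = std_sim / std_obs              # variability ratio
    beta = mean_sim / mean_obs             # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Hypothetical flows, not data from the study.
observed = [120.0, 340.0, 560.0, 410.0]
simulated = [130.0, 320.0, 540.0, 430.0]
print(f"NSE = {nse(observed, simulated):.3f}, KGE = {kge(observed, simulated):.3f}")
```

Both scores equal 1 for a perfect simulation, and an NSE of 0 means the model is no better than predicting the observed mean.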

31 pages, 2953 KB

Open Access Article

A Balanced Multimodal Multi-Task Deep Learning Framework for Robust Patient-Specific Quality Assurance

by Xiaoyang Zeng, Awais Ahmed and Muhammad Hanif Tunio

Diagnostics 2025, 15(20), 2555; https://doi.org/10.3390/diagnostics15202555 - 10 Oct 2025

Background: Multimodal deep learning has emerged as a crucial method for automated patient-specific quality assurance (PSQA) in radiotherapy research. Integrating image-based dose matrices with tabular plan complexity metrics enables more accurate prediction of quality indicators, including the Gamma Passing Rate (GPR) and dose difference (DD). However, modality imbalance remains a significant challenge, as tabular encoders often dominate training, suppressing image encoders and reducing model robustness. This issue becomes more pronounced under task heterogeneity, with GPR prediction relying more on tabular data, whereas dose difference prediction (DDP) depends heavily on image features. Methods: We propose BMMQA (Balanced Multi-modal Quality Assurance), a novel framework that achieves modality balance by adjusting modality-specific loss factors to control convergence dynamics. The framework introduces four key innovations: (1) task-specific fusion strategies (softmax-weighted attention for GPR regression and spatial cascading for DD prediction); (2) a balancing mechanism supported by Shapley values to quantify modality contributions; (3) a fast network forward mechanism for efficient computation of different modality combinations; and (4) a modality-contribution-based task weighting scheme for multi-task multimodal learning. A large-scale multimodal dataset comprising 1370 IMRT plans was curated in collaboration with Peking Union Medical College Hospital (PUMCH). Results: Experimental results demonstrate that, under the standard 2%/3 mm GPR criterion, BMMQA outperforms existing fusion baselines. Under the stricter 2%/2 mm criterion, it achieves a 15.7% reduction in mean absolute error (MAE). The framework also enhances robustness in critical failure cases (GPR < 90%) and achieves a peak SSIM of 0.964 in dose distribution prediction. Conclusions: Explicit modality balancing improves predictive accuracy and strengthens clinical trustworthiness by mitigating overreliance on a single modality. This work highlights the importance of addressing modality imbalance for building trustworthy and robust AI systems in PSQA and establishes a pioneering framework for multi-task multimodal learning.

(This article belongs to the Special Issue Deep Learning in Medical and Biomedical Image Processing)

13 pages, 2381 KB

Open Access Article

DCNN–Transformer Hybrid Network for Robust Feature Extraction in FMCW LiDAR Ranging

by Wenhao Xu, Pansong Zhang, Guohui Yuan, Shichang Xu, Longfei Li, Junxiang Zhang, Longfei Li, Tianyu Li and Zhuoran Wang

Photonics 2025, 12(10), 995; https://doi.org/10.3390/photonics12100995 - 10 Oct 2025

Frequency-Modulated Continuous-Wave (FMCW) Laser Detection and Ranging (LiDAR) systems are widely used due to their high accuracy and resolution. Nevertheless, conventional distance extraction methods often lack robustness in noisy and complex environments. To address this limitation, we propose a deep learning-based signal extraction framework that integrates a Dual Convolutional Neural Network (DCNN) with a Transformer model. The DCNN extracts multi-scale spatial features through multi-layer and pointwise convolutions, while the Transformer employs a self-attention mechanism to capture global temporal dependencies of the beat-frequency signals. The proposed DCNN–Transformer network is evaluated through beat-frequency signal inversion experiments across distances ranging from 3 m to 40 m. The experimental results show that the method achieves a mean absolute error (MAE) of 4.1 mm and a root-mean-square error (RMSE) of 3.08 mm. These results demonstrate that the proposed approach provides stable and accurate predictions, with strong generalization ability and robustness for FMCW LiDAR systems.

(This article belongs to the Section Optical Interaction Science)

20 pages, 4466 KB

Open Access Article

SA-STGCN: A Spectral-Attentive Spatio-Temporal Graph Convolutional Network for Wind Power Forecasting with Wavelet-Enhanced Multi-Scale Learning

by Yakai Yang, Zhenqing Liu and Zhongze Yu

Energies 2025, 18(19), 5315; https://doi.org/10.3390/en18195315 - 9 Oct 2025

Wind power forecasting remains a major challenge for renewable energy integration, as conventional models often perform poorly when confronted with complex atmospheric dynamics. This study addresses the problem by developing a Spectral-Attentive Spatio-Temporal Graph Convolutional Network (SA-STGCN) designed to capture the intricate temporal and spatial dependencies of wind systems. The approach first applies wavelet transform decomposition to separate volatile wind signals into distinct frequency components, enabling more interpretable representation of rapidly changing conditions. A dynamic temporal attention mechanism is then employed to adaptively identify historical patterns that are most relevant for prediction, moving beyond the fixed temporal windows used in many existing methods. In addition, spectral graph convolution is conducted in the frequency domain to capture farm-wide spatial correlations, thereby modeling long-range atmospheric interactions that conventional localized methods overlook. Although this design increases computational complexity, it proves critical for representing wind variability. Evaluation on real-world datasets demonstrates that SA-STGCN achieves substantial accuracy improvements, with a mean absolute error of 1.52 and a root mean square error of 2.31. These results suggest that embracing more expressive architectures can yield reliable forecasting performance, supporting the stable integration of wind power into modern energy systems.

(This article belongs to the Section A3: Wind, Wave and Tidal Energy)

29 pages, 5154 KB

Open Access Article

Spatial-Frequency-Scale Variational Autoencoder for Enhanced Flow Diagnostics of Schlieren Data

by Ronghua Yang, Hao Wu, Rongfei Yang, Xingshuang Wu, Yifan Song, Meiying Lü and Mingrui Wang

Sensors 2025, 25(19), 6233; https://doi.org/10.3390/s25196233 - 8 Oct 2025

Schlieren imaging is a powerful optical sensing technique that captures flow-induced refractive index gradients, offering valuable visual data for analyzing complex fluid dynamics. However, the large volume and structural complexity of the data generated by this sensor pose significant challenges for extracting key physical insights and performing efficient reconstruction and temporal prediction. In this study, we propose a Spatial-Frequency-Scale variational autoencoder (SFS-VAE), a deep learning framework designed for the unsupervised feature decomposition of Schlieren sensor data. To address the limitations of traditional β-variational autoencoder (β-VAE) in capturing complex flow regions, the Progressive Frequency-enhanced Spatial Multi-scale Module (PFSM) is designed, which enhances the structures of different frequency bands through Fourier transform and multi-scale convolution; the Feature-Spatial Enhancement Module (FSEM) employs a gradient-driven spatial attention mechanism to extract key regional features. Experiments on flat plate film-cooled jet schlieren data show that SFS-VAE can effectively preserve the information of the mainstream region and more accurately capture the high-gradient features of the jet region, reducing the Root Mean Square Error (RMSE) by approximately 16.9% and increasing the Peak Signal-to-Noise Ratio (PSNR) by approximately 1.6 dB. Furthermore, when integrated with a Transformer for temporal prediction, the model exhibits significantly improved stability and accuracy in forecasting flow field evolution. Overall, the model’s physical interpretability and generalization ability make it a powerful new tool for advanced flow diagnostics through the robust analysis of Schlieren sensor data.
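The RMSE and PSNR figures quoted above can be related with a short calculation. The sketch below assumes the common definition PSNR = 20·log10(peak / RMSE), with the same peak value and the same pixels before and after; the abstract does not state how the authors aggregated either metric, so this is a consistency check under those assumptions and not a reproduction of their evaluation. The function name is illustrative.

```python
import math


def psnr_gain_db(rmse_reduction):
    """PSNR change in dB when RMSE shrinks by the given fraction and the peak value is fixed."""
    # PSNR = 20*log10(peak / RMSE), so scaling RMSE by (1 - reduction)
    # shifts PSNR by -20*log10(1 - reduction).
    return -20.0 * math.log10(1.0 - rmse_reduction)


gain = psnr_gain_db(0.169)  # 16.9% RMSE reduction, as quoted in the abstract
print(f"{gain:.2f} dB")     # about 1.61 dB
```

Under these assumptions, a 16.9% RMSE reduction corresponds to a gain of roughly 1.6 dB, which agrees with the reported PSNR improvement.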

(This article belongs to the Section Optical Sensors)

24 pages, 22010 KB

Open Access Article

Improving the Temporal Resolution of Land Surface Temperature Using Machine and Deep Learning Models

by Mohsen Niroomand, Parham Pahlavani, Behnaz Bigdeli and Omid Ghorbanzadeh

Geomatics 2025, 5(4), 50; https://doi.org/10.3390/geomatics5040050 - 1 Oct 2025

Land Surface Temperature (LST) is a critical parameter for analyzing urban heat islands, surface–atmosphere interactions, and environmental management. This study enhances the temporal resolution of LST data by leveraging machine learning and deep learning models. A novel methodology was developed using Landsat 8 thermal data and Sentinel-2 multispectral imagery to predict LST at finer temporal intervals in an urban setting. Although Sentinel-2 lacks a thermal band, its high-resolution multispectral data, when integrated with Landsat 8 thermal observations, provide valuable complementary information for LST estimation. Several models were employed for LST prediction, including Random Forest Regression (RFR), Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM) network, and Gated Recurrent Unit (GRU). Model performance was assessed using the coefficient of determination (R²) and Mean Absolute Error (MAE). The CNN model demonstrated the highest predictive capability, achieving an R² of 74.81% and an MAE of 1.588 °C. Feature importance analysis highlighted the role of spectral bands, spectral indices, topographic parameters, and land cover data in capturing the dynamic complexity of LST variations and directional patterns. A refined CNN model, trained with the features exhibiting the highest correlation with the reference LST, achieved an improved R² of 84.48% and an MAE of 1.19 °C. These results underscore the importance of a comprehensive analysis of the factors influencing LST, as well as the need to consider the specific characteristics of the study area. Additionally, a modified TsHARP approach was applied to enhance spatial resolution, though its accuracy remained lower than that of the CNN model. The study was conducted in Tehran, a rapidly urbanizing metropolis facing rising temperatures, heavy traffic congestion, rapid horizontal expansion, and low energy efficiency. The findings contribute to urban environmental management by providing high-temporal-resolution LST data, essential for mitigating urban heat islands and improving climate resilience.

23 pages, 12353 KB

Open Access Article

Cross-Media Infrared Measurement and Temperature Rise Characteristic Analysis of Coal Mine Electrical Equipment

by Xusheng Xue, Jianxin Yang, Hongkui Zhang, Yuan Tian, Qinghua Mao, Enqiao Zhang and Fandong Chen

Energies 2025, 18(19), 5122; https://doi.org/10.3390/en18195122 - 26 Sep 2025

With the advancement of coal mine electrical equipment toward larger scale, higher complexity, and greater intelligence, traditional temperature rise monitoring methods have revealed critical limitations such as intrusive measurement, low spatial resolution, and delayed response. This study proposes a novel cross-media infrared measurement method combined with temperature rise characteristic analysis to overcome these challenges. First, a cross-media measurement principle is introduced, which uses the enclosure surface temperature as a proxy for the internal heat source temperature, thereby enabling non-invasive internal temperature rise measurement. Second, a non-contact, infrared thermography-based array-sensing measurement approach is adopted, facilitating the transition from traditional single-point temperature measurement to full-field thermal mapping with high spatial resolution. Furthermore, a multi-source data perception method is established by integrating infrared thermography with real-time operating current and ambient temperature, significantly enhancing the comprehensiveness and timeliness of thermal state monitoring. A hybrid prediction model combining Support Vector Regression (SVR) and Random Forest Regression (RFR) is developed, which effectively improves the prediction accuracy of the maximum internal temperature, particularly addressing the issue of weak surface temperature features during low heating stages. The experimental results demonstrate that the proposed method achieves high accuracy and stability across varying operating currents, with a root mean square error of 0.741 °C, a mean absolute error of 0.464 °C, and a mean absolute percentage error of 0.802%. This work provides an effective non-contact solution for real-time temperature rise monitoring and risk prevention in coal mine electrical equipment.

19 pages, 14968 KB

Open Access Article

Satellite-Ground Data Fusion for Hourly 5-km Gridded Human-Perceived Temperature Estimation in the Yangtze River Basin, China

by Huabing Ke, Zhongyuan Li, Zhaohua Liu and Zhaoliang Zeng

Remote Sens. 2025, 17(18), 3260; https://doi.org/10.3390/rs17183260 - 21 Sep 2025

Human-perceived temperature (HPT) reflects the synergistic effects of multiple meteorological factors, and its extremes challenge human-managed and natural systems worldwide, especially in densely populated regions such as the Yangtze River Basin of China. However, detailed information on HPT at high temporal (e.g., hourly) and spatial resolution is severely lacking. In this study, we conduct a collaborative inversion for 12 HPT indices at a ~5 km spatial resolution and an hourly temporal resolution in the Yangtze River Basin from multi-source data (e.g., Himawari-8 images, meteorological stations, ERA5-Land reanalysis, and DEM data) using the LightGBM model. The model exhibited high predictive accuracy across all indices, achieving an average coefficient of determination (R²) of 0.981, root mean square error (RMSE) of 1.150 °C, and mean absolute error (MAE) of 0.860 °C. These results aligned well with observational data across spatial and temporal scales, effectively capturing the spatial heterogeneity and diurnal evolution of the region’s thermal environment. Our research provides a reliable data foundation for heat-health risk assessment and regional climate adaptation strategies.

18 pages, 3356 KB

Open Access Article

Performance Comparison of Deep Learning Models for Predicting Fire-Induced Deformation in Sandwich Roof Panels

by Bohyuk Lim and Minkoo Kim

Fire 2025, 8(9), 368; https://doi.org/10.3390/fire8090368 - 18 Sep 2025

Sandwich panels are widely used in industrial roofing due to their lightweight and thermal insulation properties; however, their structural fire resistance remains insufficiently understood. This study presents a data-driven approach to predict the mid-span deformation of glass wool-cored sandwich roof panels subjected to ISO 834-5 standard fire tests. A total of 39 full-scale furnace tests were conducted, yielding 1519 data points that were utilized to develop deep learning models. Feature selection identified nine key predictors: elapsed time, panel orientation, and seven unexposed-surface temperatures. Three deep learning architectures—convolutional neural network (CNN), multilayer perceptron (MLP), and long short-term memory (LSTM)—were trained and evaluated through rigorous 5-fold cross-validation and independent external testing. Among them, the CNN approach consistently achieved the highest accuracy, with an average cross-validation performance of R² = 0.91 (mean absolute error (MAE) = 4.40; root mean square error (RMSE) = 6.42), and achieved R² = 0.76 (MAE = 6.52, RMSE = 8.62) on the external test set. These results highlight the robustness of CNN in capturing spatially ordered thermal–structural interactions while also demonstrating the limitations of MLP and LSTM regarding the same experimental data. The findings provide a foundation for integrating machine learning into performance-based fire safety engineering and suggest that data-driven prediction can complement traditional fire-resistance assessments of sandwich roofing systems.

(This article belongs to the Special Issue Current Advances in the Assessment and Mitigation of Fire Risk in Buildings and Urban Areas: 2nd Edition)

21 pages, 6059 KB

Open Access Article

A Precision Measurement Method for Rooftop Photovoltaic Capacity Using Drone and Publicly Available Imagery

by Yue Hu, Yuce Liu, Yu Zhang, Hongwei Dong, Chongzheng Li, Hongzhi Mao, Fusong Wang and Meng Wang

Buildings 2025, 15(18), 3377; https://doi.org/10.3390/buildings15183377 - 17 Sep 2025

Against the global backdrop of energy transition, the precise assessment of urban rooftop photovoltaic (PV) system capacity is recognized as crucial for optimizing the energy structure and enhancing the sustainable utilization efficiency of spatial resources. Publicly available aerial imagery is typically not orthorectified; using it directly leads to geometric distortions of rooftop PV and errors in capacity prediction. To address this, a dual-optimization framework is proposed in this study, integrating monocular vision-based 3D reconstruction with a lightweight linear model. Leveraging the orthogonal characteristics of building structures, camera self-calibration and 3D reconstruction are achieved through geometric constraints imposed by vanishing points. Scale distortion is suppressed via the incorporation of a multi-dimensional geometric constraint error control strategy. Concurrently, a linear capacity-area model is constructed, thereby simplifying the complexity inherent in traditional multi-parameter fitting. Utilizing drone oblique photography and Google Earth public imagery, 3D reconstruction was performed for 20 PV-equipped buildings in Wuhan City. Two buildings possessing high-precision field survey data were selected as typical experimental subjects for validation. The results demonstrate that the 3D reconstruction method reduced the mean absolute percentage error (MAPE), used here as an estimator of measurement uncertainty, of PV area identification from 10.58% (achieved by the 2D method) to 3.47%, while the coefficient of determination (R²) for the capacity model reached 0.9548. These results suggest that this methodology can provide effective technical support for low-cost, high-precision urban rooftop PV resource surveys. It has the potential to significantly enhance the reliability of energy planning data, thereby contributing to the efficient development of urban spatial resources and the achievement of sustainable energy transition goals.

(This article belongs to the Special Issue Research on Solar Energy System and Storage for Sustainable Buildings)

18 pages, 8138 KB

Open Access Article

Study of the Correlation Between Water Resource Changes and Drought Indices in the Yinchuan Plain Based on Multi-Source Remote Sensing and Deep Learning

by Hong Guan, Zhiguo Jiang, Jing Lu and Yukuai Wan

Water 2025, 17(18), 2740; https://doi.org/10.3390/w17182740 - 16 Sep 2025

This study examines the intricate relationship between water resource dynamics and drought indices in the Yinchuan Plain, China, by integrating multi-source remote sensing data with advanced deep learning techniques. Using data from 2002 to 2022, we applied Long Short-Term Memory (LSTM) networks to model the spatiotemporal dynamics of water resources and their relationships with the Standardized Precipitation Index (SPI), Standardized Precipitation Evapotranspiration Index (SPEI), and Palmer Drought Severity Index (PDSI). Our findings reveal a strong correlation between total water resources and the SPEI (r = 0.81, p < 0.001), underscoring the pivotal role of evapotranspiration in this region’s water balance. The LSTM model outperformed traditional statistical methods, achieving a Root Mean Square Error of 0.142 for water resource predictions and 0.118 for drought index forecasts. Spatial analysis indicated stronger correlations in the northern Yinchuan Plain, likely influenced by its proximity to the Yellow River and regional water management practices. Wavelet coherence analysis identified significant coherence at the 6–12-month scale, highlighting the importance of seasonal to inter-annual strategies for water resource management. These results provide a robust foundation for developing effective water management policies and drought mitigation strategies in arid and semi-arid regions. The methodologies presented are broadly applicable to similar water-scarce regions, contributing to global efforts in sustainable water resource management under changing climatic conditions.

(This article belongs to the Section Water Resources Management, Policy and Governance)

20 pages, 803 KB

Open AccessArticle

The Effective Highlight-Detection Model for Video Clips Using Spatial—Perceptual

by Sungshin Kwak, Jaedong Lee and Sohyun Park

Electronics 2025, 14(18), 3640; https://doi.org/10.3390/electronics14183640 - 15 Sep 2025

Viewed by 1074

Abstract

With the rapid growth of video platforms such as YouTube, Bilibili, and Dailymotion, an enormous amount of video content is being shared worldwide. In this environment, content providers are increasingly adopting methods that restructure videos around highlight scenes and distribute them in short-form formats to encourage more efficient content consumption by viewers. As a result of this trend, the importance of highlight extraction technologies capable of automatically identifying key scenes from large-scale video datasets has been steadily increasing. To address this need, this study proposes SPOT (Spatial Perceptual Optimized TimeSformer), a highlight extraction model. The proposed model enhances spatial perceptual capability by integrating a CNN encoder into the internal structure of the existing Transformer-based TimeSformer, enabling simultaneous learning of both the local and global features of a video. The experiments were conducted using Google’s YT-8M video dataset along with the MR.Hisum dataset, which provides organized highlight information. The SPOT model adopts a regression-based highlight prediction framework. Experimental results on video datasets of varying complexity showed that, in the high-complexity group, the SPOT model achieved a reduction in mean squared error (MSE) of approximately 0.01 (from 0.090 to 0.080) compared to the original TimeSformer. Furthermore, the model outperformed the baseline across all complexity groups in terms of mAP, Coverage, and F1-Score metrics. These results suggest that the proposed model holds strong potential for diverse multimodal applications such as video summarization, content recommendation, and automated video editing. Moreover, it is expected to serve as a foundational technology for advancing video-based artificial intelligence systems in the future. Full article

(This article belongs to the Special Issue Image Processing Based on Convolution Neural Network: 2nd Edition)

Search Results (1,044)
