Search Results (380)

Search Parameters:
Keywords = forest scenes

23 pages, 7371 KiB  
Article
A Novel Method for Estimating Building Height from Baidu Panoramic Street View Images
by Shibo Ge, Jiping Liu, Xianghong Che, Yong Wang and Haosheng Huang
ISPRS Int. J. Geo-Inf. 2025, 14(8), 297; https://doi.org/10.3390/ijgi14080297 - 30 Jul 2025
Abstract
Building height information plays an important role in many urban applications, such as urban planning, disaster management, and environmental studies. With the rapid development of real-scene maps, street-view images are becoming a new data source for building height estimation, given their ease of collection and low cost. However, existing studies on building height estimation primarily use remote sensing images, with little exploration of height estimation from street-view images. In this study, we propose a deep learning-based method for estimating the height of a single building in Baidu panoramic street-view imagery. First, the Segment Anything Model was used to extract the region-of-interest image and location features of individual buildings from the panorama. Subsequently, a cross-view matching algorithm was proposed that combines the Baidu panorama with building footprint data carrying height information to generate building height samples. Finally, a Two-Branch Feature Fusion (TBFF) model was constructed to combine building location features and visual features, enabling accurate height estimation for individual buildings. The experimental results showed that the TBFF model performed best, with an RMSE of 5.69 m, an MAE of 3.97 m, and a MAPE of 0.11. Compared with two state-of-the-art methods, the TBFF model exhibited robustness and higher accuracy: the Random Forest model had an RMSE of 11.83 m, an MAE of 4.76 m, and a MAPE of 0.32, and the Pano2Geo model had an RMSE of 10.51 m, an MAE of 6.52 m, and a MAPE of 0.22. The ablation analysis demonstrated that fusing building location and visual features improves height estimation accuracy by 14.98% to 69.99%. Moreover, the accuracy of the proposed method meets the LOD1-level 3D modeling requirements defined by the OGC (height error ≤ 5 m), providing data support for urban research.
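The RMSE, MAE, and MAPE figures quoted above are standard regression error metrics; a minimal sketch of how they are computed over paired predicted and true heights (the values below are illustrative, not the paper's data):

```python
import math

def height_errors(pred, true):
    """Compute RMSE, MAE, and MAPE for paired height estimates (in metres)."""
    n = len(pred)
    rmse = math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / n)
    mae = sum(abs(p - t) for p, t in zip(pred, true)) / n
    mape = sum(abs(p - t) / t for p, t in zip(pred, true)) / n
    return rmse, mae, mape

# toy example: two buildings, predicted vs. measured heights
rmse, mae, mape = height_errors([20.0, 33.0], [22.0, 30.0])
```

Note that MAPE is reported here as a fraction (0.11 in the abstract corresponds to an 11% relative error).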

24 pages, 12286 KiB  
Article
A UAV-Based Multi-Scenario RGB-Thermal Dataset and Fusion Model for Enhanced Forest Fire Detection
by Yalin Zhang, Xue Rui and Weiguo Song
Remote Sens. 2025, 17(15), 2593; https://doi.org/10.3390/rs17152593 - 25 Jul 2025
Abstract
UAVs are essential for forest fire detection given vast forest areas and the inaccessibility of high-risk zones, enabling rapid long-range inspection and detailed close-range surveillance. However, aerial photography faces challenges such as multi-scale target recognition and complex scenario adaptation (e.g., deformation, occlusion, and lighting variations). RGB-Thermal fusion methods effectively integrate visible-light texture and thermal infrared temperature features, but current approaches are constrained by limited datasets and insufficient exploitation of cross-modal complementary information, ignoring cross-level feature interaction. To address data scarcity in wildfire scenarios, we constructed RGBT-3M, a time-synchronized, multi-scene, multi-angle aerial RGB-Thermal dataset with “Smoke–Fire–Person” annotations and modal alignment via the M-RIFT method. We then propose CP-YOLOv11-MF, a fusion detection model based on the advanced YOLOv11 framework that progressively learns the complementary heterogeneous features of each modality. Experimental validation demonstrates the superiority of our method, with a precision of 92.5%, a recall of 93.5%, an mAP50 of 96.3%, and an mAP50-95 of 62.9%. The model’s RGB-Thermal fusion capability enhances early fire detection, offering a benchmark dataset and a methodological advance for intelligent forest conservation, with implications for AI-driven ecological protection.
(This article belongs to the Special Issue Advances in Spectral Imagery and Methods for Fire and Smoke Detection)

17 pages, 3823 KiB  
Article
Lightweight UAV-Based System for Early Fire-Risk Identification in Wild Forests
by Akmalbek Abdusalomov, Sabina Umirzakova, Alpamis Kutlimuratov, Dilshod Mirzaev, Adilbek Dauletov, Tulkin Botirov, Madina Zakirova, Mukhriddin Mukhiddinov and Young Im Cho
Fire 2025, 8(8), 288; https://doi.org/10.3390/fire8080288 - 23 Jul 2025
Abstract
The escalating occurrence and impacts of wildfires threaten the public, economies, and global ecosystems. Physiologically declining or dead trees account for a large share of ignitions because they have lower moisture content and ignite more readily. Preventing wildfires therefore requires identifying and removing such hazardous vegetation early. This work proposes a real-time fire-risk tree detection framework using UAV images, based on lightweight object detection. The model uses a MobileNetV3-Small backbone, optimized for edge deployment, combined with an SSD head. This configuration yields a highly optimized and fast UAV-based inference pipeline. The dataset used in this study comprises over 3000 annotated RGB UAV images of trees in healthy, partially dead, and fully dead conditions, collected from mixed real-world forest scenes and public drone imagery repositories. Thorough evaluation shows that the proposed model outperforms conventional SSD and recent YOLO variants in precision (94.1%), recall (93.7%), mAP (90.7%), and F1 score (91.0%), while remaining lightweight (8.7 MB) and fast (62.5 FPS on a Jetson Xavier NX). These findings support the model’s effectiveness for large-scale continuous forest monitoring to detect health degradation and proactively mitigate wildfire risks. The framework distinguishes itself among UAV-based environmental monitoring systems by treating the balance of detection accuracy, speed, and resource efficiency as a fundamental design principle.
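The precision, recall, and F1 figures above are related: F1 is the harmonic mean of precision and recall. A quick sketch (note that a paper's reported F1 may be averaged per class or per image, so it need not match this direct computation from the overall precision and recall):

```python
def f1_score(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# overall precision/recall values from the abstract, as fractions
f1 = f1_score(0.941, 0.937)
```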

22 pages, 9071 KiB  
Article
Integrating UAV-Based RGB Imagery with Semi-Supervised Learning for Tree Species Identification in Heterogeneous Forests
by Bingru Hou, Chenfeng Lin, Mengyuan Chen, Mostafa M. Gouda, Yunpeng Zhao, Yuefeng Chen, Fei Liu and Xuping Feng
Remote Sens. 2025, 17(15), 2541; https://doi.org/10.3390/rs17152541 - 22 Jul 2025
Abstract
The integration of unmanned aerial vehicle (UAV) remote sensing and deep learning has emerged as a highly effective strategy for inventorying forest resources. However, the spatiotemporal variability of forest environments and the scarcity of annotated data hinder the performance of conventional supervised deep-learning models. To overcome these challenges, this study developed efficient tree (ET), a semi-supervised tree detector designed for forest scenes. ET employs an enhanced YOLO model (YOLO-Tree) as its base detector and incorporates a teacher–student semi-supervised learning (SSL) framework based on pseudo-labeling, effectively leveraging abundant unlabeled data to bolster model robustness. The results revealed that SSL significantly improved outcomes in scenarios with sparse labeled data, specifically when the annotation proportion was below 50%. Additionally, employing overlapping cropping as a data augmentation strategy mitigated instability during semi-supervised training under limited sample sizes. Notably, introducing unlabeled data from external sites enhanced the accuracy and cross-site generalization of models trained on diverse datasets, achieving F1, mAP50, and mAP50-95 scores of 0.979, 0.992, and 0.871, respectively. In conclusion, this study highlights the potential of combining UAV-based RGB imagery with SSL to advance tree species identification in heterogeneous forests.
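The pseudo-labeling step at the heart of a teacher–student SSL framework can be sketched generically: the teacher labels unlabeled images, and only confident detections are kept as training targets for the student. This is a schematic under assumed structures (the name `teacher_predict`, the score-bearing box dicts, and the 0.7 threshold are illustrative, not taken from the paper):

```python
CONF_THRESH = 0.7  # keep only confident teacher detections as pseudo-labels

def pseudo_labels(teacher_predict, unlabeled_images):
    """Run the teacher on unlabeled images and keep confident boxes."""
    labeled = []
    for img in unlabeled_images:
        boxes = [b for b in teacher_predict(img) if b["score"] >= CONF_THRESH]
        if boxes:
            labeled.append((img, boxes))  # becomes student training data
    return labeled

# toy teacher: returns one confident and one low-confidence detection
fake_teacher = lambda img: [{"box": (0, 0, 10, 10), "score": 0.9},
                            {"box": (5, 5, 8, 8), "score": 0.4}]
batch = pseudo_labels(fake_teacher, ["img1.jpg", "img2.jpg"])
```

In practice the threshold trades pseudo-label quality against quantity, which is why sparse-label regimes (below 50% annotation here) benefit most.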
(This article belongs to the Special Issue Remote Sensing-Assisted Forest Inventory Planning)

16 pages, 539 KiB  
Article
Virtual Reality as a Non-Pharmacological Aid for Reducing Anxiety in Pediatric Dental Procedures
by Laria-Maria Trusculescu, Dana Emanuela Pitic, Andreea Sălcudean, Ramona Amina Popovici, Norina Forna, Silviu Constantin Badoiu, Alexandra Enache, Sorina Enasoni, Andreea Kiș, Raluca Mioara Cosoroabă, Cristina Ioana Talpos-Niculescu, Corneliu Constantin Zeicu, Maria-Melania Cozma and Liana Todor
Children 2025, 12(7), 930; https://doi.org/10.3390/children12070930 - 14 Jul 2025
Abstract
Background/Objectives: Dental anxiety in children is a common issue that can hinder the delivery of effective dental care. Traditional approaches to managing it are often insufficient or involve pharmacological interventions. This study examines the potential of virtual reality (VR) to reduce anxiety in children undergoing simple dental procedures. By immersing children in relaxing VR environments (such as beaches, forests, mountains, or underwater scenes with calm music), the objective is to assess VR’s effectiveness in calming pediatric patients during these procedures. Methods: Children scheduled for minor dental treatments wore a wearable device that monitored pulse, perspiration, and stress levels. Each child’s baseline data were collected without the VR headset, followed by data collection during VR exposure before and during dental procedures. VR scenarios ranged from soothing nature scenes to animated cartoons designed to foster relaxation. Results: The data showed a reduction in physiological indicators of stress, such as lower heart rate and reduced perspiration, when the VR headset was used. Children appeared more relaxed, with a calmer response during the procedure itself, compared to baseline levels without VR. Conclusions: This study provides preliminary evidence supporting VR as an effective tool for reducing anxiety and stress in pediatric dental patients. By offering an engaging, immersive experience, VR can serve as an alternative or complementary approach to traditional anxiety management strategies in pediatric dentistry, potentially improving patient comfort and cooperation during dental procedures. Further research could determine whether VR may serve as an alternative to local anesthesia for non-intrusive pediatric dental procedures.
(This article belongs to the Special Issue Children’s Behaviour and Social-Emotional Competence)

24 pages, 4442 KiB  
Article
Time-Series Correlation Optimization for Forest Fire Tracking
by Dongmei Yang, Guohao Nie, Xiaoyuan Xu, Debin Zhang and Xingmei Wang
Forests 2025, 16(7), 1101; https://doi.org/10.3390/f16071101 - 3 Jul 2025
Abstract
Accurate real-time tracking of forest fires using UAV platforms is crucial for timely early warning, reliable spread prediction, and effective autonomous suppression. Existing detection-based multi-object tracking methods struggle to associate targets accurately and maintain smooth tracking trajectories in complex forest environments. These difficulties stem from the highly nonlinear movement of flames relative to the observing UAV and the lack of robust fire-specific feature modeling. To address these challenges, we introduce AO-OCSORT, an association-optimized observation-centric tracking framework designed to enhance robustness in dynamic fire scenarios. AO-OCSORT builds on the YOLOX detector. To associate detection results across frames and form smooth trajectories, we propose a temporal–physical similarity metric that utilizes temporal information from the short-term motion of targets and incorporates physical flame characteristics derived from optical flow and contours. Subsequently, scene classification and low-score filtering are employed in a hierarchical association strategy, reducing the impact of false detections and interfering objects. Additionally, a virtual trajectory generation module is proposed, employing a kinematic model to maintain trajectory continuity during flame occlusion. Evaluated locally on the 1080p FireMOT UAV wildfire dataset, AO-OCSORT achieves a 5.4% improvement in MOTA over advanced baselines at 28.1 FPS, meeting real-time requirements. This improvement enhances the reliability of fire-front localization, which is crucial for forest fire management. Furthermore, AO-OCSORT demonstrates strong generalization, achieving 41.4% MOTA on VisDrone, 80.9% on MOT17, and 92.2% on DanceTrack.
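MOTA, the headline tracking metric above, penalizes missed targets, false positives, and identity switches relative to the number of ground-truth objects; a minimal sketch with illustrative counts (not the paper's evaluation data):

```python
def mota(false_negatives, false_positives, id_switches, num_gt):
    """Multi-Object Tracking Accuracy: 1 - (FN + FP + IDSW) / GT."""
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt

# toy evaluation: 1000 ground-truth boxes across all frames
score = mota(false_negatives=120, false_positives=80, id_switches=10,
             num_gt=1000)
```

Because all three error types share one denominator, MOTA can even go negative when a tracker produces more errors than there are ground-truth objects.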
(This article belongs to the Special Issue Advanced Technologies for Forest Fire Detection and Monitoring)

27 pages, 13245 KiB  
Article
LHRF-YOLO: A Lightweight Model with Hybrid Receptive Field for Forest Fire Detection
by Yifan Ma, Weifeng Shan, Yanwei Sui, Mengyu Wang and Maofa Wang
Forests 2025, 16(7), 1095; https://doi.org/10.3390/f16071095 - 2 Jul 2025
Abstract
Timely and accurate detection of forest fires is crucial for protecting forest ecosystems. However, traditional monitoring methods face significant challenges in detecting forest fires effectively, primarily due to the dynamic spread of flames and smoke, their irregular morphologies, and the semi-transparent nature of smoke, which make it extremely difficult to extract key visual features. Additionally, deploying detection systems on edge devices with limited computational resources remains challenging. To address these issues, this paper proposes a lightweight hybrid receptive field model (LHRF-YOLO), which leverages deep learning to overcome the shortcomings of traditional monitoring methods for fire detection on edge devices. First, a hybrid receptive field extraction module is designed by integrating a 2D selective scan mechanism with a residual multi-branch structure. This significantly enhances the model’s contextual understanding of the entire image scene while maintaining low computational complexity. Second, a dynamic enhanced downsampling module is proposed, which employs feature reorganization and channel-wise dynamic weighting strategies to minimize the loss of critical details, such as fine smoke textures, while reducing image resolution. Furthermore, a scale-weighted fusion module is introduced to optimize multi-scale feature fusion through adaptive weight allocation, addressing the information dilution and imbalance caused by traditional fusion methods. Finally, the Mish activation function replaces the SiLU activation function to improve the model’s ability to capture flame edges and faint smoke textures. Experimental results on the self-constructed Fire-Smoke dataset demonstrate that LHRF-YOLO achieves significant model compression while further improving accuracy compared to the baseline YOLOv11 model: the parameter count is reduced to only 2.25 M (a 12.8% reduction), computational complexity to 5.4 GFLOPs (a 14.3% decrease), and mAP50 is increased to 87.6%, surpassing the baseline. Additionally, LHRF-YOLO exhibits leading generalization performance on the cross-scenario M4SFWD dataset. The proposed method balances performance and resource efficiency, providing a feasible solution for real-time, efficient fire detection on resource-constrained edge devices.
(This article belongs to the Special Issue Forest Fires Prediction and Detection—2nd Edition)

29 pages, 3799 KiB  
Article
Forest Three-Dimensional Reconstruction Method Based on High-Resolution Remote Sensing Image Using Tree Crown Segmentation and Individual Tree Parameter Extraction Model
by Guangsen Ma, Gang Yang, Hao Lu and Xue Zhang
Remote Sens. 2025, 17(13), 2179; https://doi.org/10.3390/rs17132179 - 25 Jun 2025
Abstract
Efficient and accurate acquisition of tree distribution and three-dimensional geometric information in forest scenes, along with three-dimensional reconstruction of entire forest environments, holds significant application value in precision forestry and forestry digital twins. However, due to complex vegetation structures, fine geometric details, and severe occlusions in forest environments, existing methods, whether vision-based or LiDAR-based, still face challenges such as high data acquisition costs, feature extraction difficulties, and limited reconstruction accuracy. This study focuses on reconstructing tree distribution and extracting key individual-tree parameters, and it proposes a forest 3D reconstruction framework based on high-resolution remote sensing images. First, an optimized Mask R-CNN model was employed to segment individual tree crowns and extract distribution information. Then, a Tree Parameter and Reconstruction Network (TPRN) was constructed to estimate key structural parameters (height, DBH, etc.) directly from crown images and generate 3D tree models. The 3D forest scene could then be reconstructed by combining the distribution information with the 3D tree models. In addition, to address data scarcity, a hybrid training strategy integrating virtual and real data was proposed for crown segmentation and individual-tree parameter estimation. Experimental results demonstrated that the proposed method can reconstruct an entire forest scene within seconds while accurately preserving tree distribution and individual tree attributes. In two real-world plots, tree counting accuracy exceeded 90%, with an average tree localization error under 0.2 m, and the TPRN achieved parameter extraction accuracies of 92.7% and 96% for tree height, and 95.4% and 94.1% for DBH. Furthermore, the generated individual tree models achieved average Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index (SSIM) scores of 11.24 and 0.53, respectively, validating the reconstruction quality. This approach enables fast and effective large-scale forest scene reconstruction from a single remote sensing image, demonstrating significant potential for dynamic forest resource monitoring and forestry-oriented digital twin systems.
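The PSNR score used above to assess model quality is derived from the mean squared error between two images; a minimal sketch on flat pixel lists (toy values, not the study's renderings):

```python
import math

def psnr(img_a, img_b, max_val=255.0):
    """Peak Signal-to-Noise Ratio computed from the MSE of two equal-size images."""
    mse = sum((a - b) ** 2 for a, b in zip(img_a, img_b)) / len(img_a)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(max_val ** 2 / mse)

# four-pixel toy images with small differences
value = psnr([0, 128, 255, 64], [2, 126, 250, 70])
```

Higher is better; the modest 11.24 dB reported above reflects how hard it is to match rendered tree models pixel-for-pixel against real imagery.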
(This article belongs to the Special Issue Digital Modeling for Sustainable Forest Management)

19 pages, 2832 KiB  
Article
High Spatial Resolution Soil Moisture Mapping over Agricultural Field Integrating SMAP, IMERG, and Sentinel-1 Data in Machine Learning Models
by Diego Tola, Lautaro Bustillos, Fanny Arragan, Rene Chipana, Renaud Hostache, Eléonore Resongles, Raúl Espinoza-Villar, Ramiro Pillco Zolá, Elvis Uscamayta, Mayra Perez-Flores and Frédéric Satgé
Remote Sens. 2025, 17(13), 2129; https://doi.org/10.3390/rs17132129 - 21 Jun 2025
Abstract
Soil moisture content (SMC) is a critical parameter for agricultural productivity, particularly in semi-arid regions, where irrigation is used extensively to offset water deficits and ensure decent yields. Yet the socio-economic and remote context of these regions prevents SMC monitoring that is sufficiently dense in space and time to support farmers in avoiding unsustainable irrigation practices and preserving water resource availability. In this context, our study addresses the challenge of high-spatial-resolution (i.e., 20 m) SMC estimation by integrating remote sensing datasets in machine learning models. For this purpose, a dataset of 166 soil-sample SMC measurements, along with corresponding SMC, precipitation, and radar signals derived from Soil Moisture Active Passive (SMAP), Integrated Multi-satellitE Retrievals for GPM (IMERG), and Sentinel-1 (S1), respectively, was used to assess the reliability of four machine learning models (Decision Tree, DT; Random Forest, RF; Gradient Boosting, GB; Extreme Gradient Boosting, XGB) for SMC mapping. First, each model was trained and validated using only the coarse-spatial-resolution (i.e., 10 km) SMAP SMC and IMERG precipitation estimates as independent features; second, S1 information (i.e., 20 m) derived from single scenes and/or composite images was added as independent features to highlight its benefit for SMC mapping at high spatial resolution. Results show that adding S1 information from both single scenes and composite images to the SMAP SMC and IMERG precipitation data significantly improves model reliability: R² increased by 12% to 16%, while RMSE decreased by 10% to 18%, depending on the model considered. Overall, all models provided reliable SMC estimates at 20 m spatial resolution, with the GB model performing best (R² = 0.86, RMSE = 2.55%).
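The R² values quoted above are coefficients of determination, comparing residual error against the variance of the observations; a minimal sketch (toy SMC values in %, not the study's data):

```python
def r_squared(pred, obs):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for p, o in zip(pred, obs))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot

# three toy predicted vs. observed soil moisture values
r2 = r_squared([10.0, 20.0, 30.0], [12.0, 18.0, 33.0])
```

An R² of 1 means the predictions explain all observed variance; the GB model's 0.86 indicates most, but not all, of the SMC variability is captured.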
(This article belongs to the Special Issue Remote Sensing for Soil Properties and Plant Ecosystems)

18 pages, 4774 KiB  
Article
InfraredStereo3D: Breaking Night Vision Limits with Perspective Projection Positional Encoding and Groundbreaking Infrared Dataset
by Yuandong Niu, Limin Liu, Fuyu Huang, Juntao Ma, Chaowen Zheng, Yunfeng Jiang, Ting An, Zhongchen Zhao and Shuangyou Chen
Remote Sens. 2025, 17(12), 2035; https://doi.org/10.3390/rs17122035 - 13 Jun 2025
Abstract
In fields such as military reconnaissance, forest fire prevention, and autonomous driving at night, there is an urgent need for high-precision three-dimensional reconstruction in low-light or nighttime environments. RGB cameras rely on external light, so image quality declines significantly in such conditions, making it difficult to meet task requirements. LiDAR-based methods perform poorly in rainy and foggy weather, in close-range scenes, and in scenarios requiring thermal imaging data. In contrast, infrared cameras can effectively overcome these challenges because their imaging mechanism differs from those of RGB cameras and LiDAR. However, research on three-dimensional scene reconstruction from infrared images is relatively immature, especially in infrared binocular stereo matching. This situation presents two main challenges: first, there is no dataset dedicated to infrared binocular stereo matching; second, the lack of texture information in infrared images limits the extension of RGB methods to the infrared reconstruction problem. To solve these problems, this study first constructs an infrared binocular stereo matching dataset and then proposes an innovative transformer method based on perspective projection positional encoding to perform the infrared binocular stereo matching task. A stereo matching network combining a transformer with a cost volume is constructed. Existing work on transformer positional encoding usually adopts a parallel projection model to simplify computation; our method instead uses the actual perspective projection model, so that each pixel is associated with a different projection ray. This effectively solves the feature extraction and matching problems caused by insufficient texture information in infrared images and significantly improves matching accuracy. Experiments on the proposed infrared binocular stereo matching dataset demonstrated the effectiveness of the method.
(This article belongs to the Collection Visible Infrared Imaging Radiometers and Applications)

19 pages, 3237 KiB  
Article
Therapeutic Potentials of Virtual Blue Spaces: A Study on the Physiological and Psychological Health Benefits of Virtual Waterscapes
by Su-Hsin Lee, Yi-Chien Chu, Li-Wen Wang and Shu-Chen Tsai
Healthcare 2025, 13(11), 1353; https://doi.org/10.3390/healthcare13111353 - 5 Jun 2025
Abstract
Background: Physical and mental health issues are increasingly a global focus of attention, and telemedicine is attracting wide academic interest. Objectives: This exploratory study investigated the therapeutic potential of immersive virtual blue spaces for individuals with distinct lifestyle backgrounds, specifically office workers and retirees. The research explores how different virtual waterscapes influence emotional and physiological states in populations with varying stress profiles and life rhythms. Methods: A mixed-methods design was employed, combining quantitative measurements with qualitative interviews. In September 2023, forty participants (20 office workers and 20 retirees) from Hualien, Taiwan, were exposed to 360° VR simulations of three blue environments: a forest stream, a forest waterfall, and a beach scene. Pre- and post-session assessments included physiological indicators (blood pressure and heart rate) and emotional states measured using the Profile of Mood States (POMS) scale. Results: Significant physiological relaxation was observed among retirees. Office workers demonstrated greater emotional improvements, with noticeable variation depending on the type of virtual environment. Comparative analysis highlighted the stream landscape’s unique benefit in reducing depression and enhancing positive mood states. Thematic findings from post-session interviews further indicated that emotional responses were moderated by individual background and prior emotional experiences. Conclusions: These findings underscore the short-term therapeutic potential of virtual blue spaces for diverse user groups and reveal the influence of personal context on their effectiveness. The study supports the integration of VR-based nature exposure into personalized digital healthcare interventions and offers a foundation for future development of immersive therapeutic technologies.

28 pages, 16050 KiB  
Article
Advancing ALS Applications with Large-Scale Pre-Training: Framework, Dataset, and Downstream Assessment
by Haoyi Xiu, Xin Liu, Taehoon Kim and Kyoung-Sook Kim
Remote Sens. 2025, 17(11), 1859; https://doi.org/10.3390/rs17111859 - 27 May 2025
Abstract
The pre-training and fine-tuning paradigm has significantly advanced satellite remote sensing applications. However, its potential remains largely underexplored for airborne laser scanning (ALS), a key technology in domains such as forest management and urban planning. In this study, we address this gap by constructing a large-scale ALS point cloud dataset and evaluating its effectiveness in downstream applications. We first propose a simple, generalizable framework for dataset construction, designed to maximize land cover and terrain diversity while allowing flexible control over dataset size. We instantiate this framework using ALS, land cover, and terrain data collected across the contiguous United States, resulting in a dataset covering more than 17,000 km² (184 billion points) with diverse land cover and terrain types. As a baseline self-supervised learning model, we adopt BEV-MAE, a state-of-the-art masked autoencoder for 3D outdoor point clouds, and pre-train it on the constructed dataset. The resulting models are fine-tuned for several downstream tasks, including tree species classification, terrain scene recognition, and point cloud semantic segmentation. Our results show that pre-trained models consistently outperform their counterparts trained from scratch across all downstream tasks, demonstrating the strong transferability of the learned representations. Additionally, we find that scaling the dataset using the proposed framework leads to consistent performance improvements, whereas datasets constructed via random sampling fail to achieve comparable gains.
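Masked-autoencoder pre-training, as adopted here, hinges on hiding a large fraction of the input and training the model to reconstruct it. The masking step can be sketched generically (this illustrates the general MAE idea, not BEV-MAE's actual bird's-eye-view voxel masking; the 75% ratio is a common default in the MAE literature, not taken from this paper):

```python
import random

def split_visible_masked(num_patches, mask_ratio=0.75, seed=0):
    """Randomly partition patch indices into visible and masked sets."""
    rng = random.Random(seed)
    idx = list(range(num_patches))
    rng.shuffle(idx)
    n_masked = int(num_patches * mask_ratio)
    # encoder sees only the visible patches; the decoder reconstructs the masked ones
    return idx[n_masked:], idx[:n_masked]  # (visible, masked)

visible, masked = split_visible_masked(100)
```

The reconstruction loss is then computed only on the masked set, which is what forces the encoder to learn transferable representations from unlabeled point clouds.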

22 pages, 25979 KiB  
Article
Advancing Early Wildfire Detection: Integration of Vision Language Models with Unmanned Aerial Vehicle Remote Sensing for Enhanced Situational Awareness
by Leon Seidel, Simon Gehringer, Tobias Raczok, Sven-Nicolas Ivens, Bernd Eckardt and Martin Maerz
Drones 2025, 9(5), 347; https://doi.org/10.3390/drones9050347 - 3 May 2025
Viewed by 1627
Abstract
Early wildfire detection is critical for effective suppression efforts, necessitating rapid alerts and precise localization. While computer vision techniques offer reliable fire detection, they often lack contextual understanding. This paper addresses this limitation by utilizing Vision Language Models (VLMs) to generate structured scene descriptions from Unmanned Aerial Vehicle (UAV) imagery. UAV-based remote sensing provides diverse perspectives on potential wildfires, and state-of-the-art VLMs enable rapid and detailed scene characterization. We evaluated both cloud-based (OpenAI, Google DeepMind) and open-weight, locally deployed VLMs on a novel evaluation dataset specifically curated for understanding forest fire scenes. Our results demonstrate that relatively compact, fine-tuned VLMs can provide rich contextual information, including forest type, fire state, and fire type. Specifically, our best-performing model, ForestFireVLM-7B (fine-tuned from Qwen2.5-VL-7B), achieved a 76.6% average accuracy across all categories, surpassing the strongest closed-weight baseline (Gemini 2.0 Pro at 65.5%). Furthermore, zero-shot evaluation on the publicly available FIgLib dataset demonstrated state-of-the-art smoke detection accuracy using VLMs. Our findings highlight the potential of fine-tuned, open-weight VLMs for enhanced wildfire situational awareness via detailed scene interpretation. Full article
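The reported 76.6% figure is a macro average: accuracy is computed per category (e.g., forest type, fire state, fire type) over the structured scene descriptions and then averaged across categories. A minimal sketch of that scoring, with hypothetical field names standing in for the paper's actual schema:

```python
def category_accuracies(predictions, ground_truth, categories):
    """Per-category and macro-averaged accuracy for structured scene
    descriptions, where each sample is a dict of categorical fields,
    e.g. {"forest_type": ..., "fire_state": ..., "fire_type": ...}."""
    per_cat = {}
    for cat in categories:
        correct = sum(p[cat] == g[cat] for p, g in zip(predictions, ground_truth))
        per_cat[cat] = correct / len(ground_truth)
    macro = sum(per_cat.values()) / len(per_cat)
    return per_cat, macro
```

Macro averaging weights each category equally, so a model cannot inflate its score by excelling only at the most common category.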

16 pages, 4821 KiB  
Article
Pilot Performance Testing of a Battery-Powered Salamander Micro-Skidder in Timber Harvesting
by Grzegorz Szewczyk, Jozef Krilek, Paweł Tylek, Ján Hanes, Slavomír Petrenec, Miłosz Szczepańczyk and Dominik Józefczyk
Forests 2025, 16(5), 753; https://doi.org/10.3390/f16050753 - 28 Apr 2025
Viewed by 489
Abstract
The objective of our research was to ascertain the time intensity of timber skidding with a prototype ATV Salamander 600 4 × 4 micro-skidder and to characterize the operator’s field of view. The time intensity of skidding is approximately 20 min/m³ at a distance of 20 m when skidding timber from the forest stand and approximately 10 min/m³ when skidding along the skid trail for a distance of 80 m, which is comparable to other machines of this type, despite problems with the raw material jamming on rugged terrain in the first phase of the skidding process. A significant discrepancy (6%) in wheel slippage between the front and rear axles was particularly pronounced when pulling timber up to the skid trail. This can be attributed to the transport hitch being positioned too high, which reduces the load on the hitch and unloads the front axle. The observed difficulties in skidding required the operator to scan a wide visual scene when working in the stand. The initial phase of timber skidding in the forest stand lacked smooth flow, which increased mental workload, as indicated by longer saccades: on average, saccades were approximately 80% longer than during work on the skid trail. Full article
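The two headline measurements reduce to simple ratios: time intensity is worked minutes per cubic metre of skidded timber, and wheel slip compares the distance a wheel's rotation would have covered by free rolling with the distance the machine actually travelled. A small sketch of both calculations (the function names and the slip definition used here are illustrative assumptions, not the paper's exact formulas):

```python
def time_intensity(work_minutes, volume_m3):
    """Skidding time intensity in min/m³, e.g. 40 min spent on 2 m³ gives 20 min/m³."""
    return work_minutes / volume_m3

def wheel_slip(rolled_distance_m, travelled_distance_m):
    """Slip as a fraction: how much farther the wheel rotation 'rolled'
    than the machine actually moved."""
    return (rolled_distance_m - travelled_distance_m) / rolled_distance_m
```

With these definitions, the reported 6% discrepancy would be the difference between `wheel_slip(...)` evaluated for the front axle and for the rear axle over the same pass.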
(This article belongs to the Section Forest Operations and Engineering)

25 pages, 15523 KiB  
Article
Comparative Analysis of Novel View Synthesis and Photogrammetry for 3D Forest Stand Reconstruction and Extraction of Individual Tree Parameters
by Guoji Tian, Chongcheng Chen and Hongyu Huang
Remote Sens. 2025, 17(9), 1520; https://doi.org/10.3390/rs17091520 - 25 Apr 2025
Cited by 1 | Viewed by 983
Abstract
The accurate and efficient 3D reconstruction of trees is beneficial for urban forest resource assessment and management. Close-range photogrammetry (CRP) is widely used in the 3D reconstruction of forest scenes. However, in practical forestry applications, challenges such as low reconstruction efficiency and poor reconstruction quality persist. Recently, novel view synthesis (NVS) technologies, such as neural radiance fields (NeRF) and 3D Gaussian splatting (3DGS), have shown great potential for the 3D reconstruction of plants from a limited number of images. However, existing research typically focuses on small plants in orchards or on individual trees, and it remains uncertain whether this technology can be effectively applied to larger, more complex stands or forest scenes. In this study, we collected sequential images of urban forest plots of varying complexity using imaging devices with different resolutions (smartphone cameras and a UAV). The plots included one with sparse, leafless trees and another with dense foliage and heavier occlusion. We then performed dense reconstruction of the forest stands using the NeRF and 3DGS methods and compared the resulting point cloud models with those obtained through photogrammetric reconstruction and laser scanning. The results show that, compared to photogrammetry, the NVS methods have a significant advantage in reconstruction efficiency. The photogrammetric method is suitable for relatively simple forest stands but adapts poorly to complex ones, producing tree point clouds with excessive canopy noise and erroneously reconstructed trees with duplicated trunks and canopies. In contrast, NeRF adapts better to complex forest stands, yielding tree point clouds of the highest quality with more detailed trunk and canopy information, although it can produce reconstruction errors in the ground area when the input views are limited. The 3DGS method has a relatively poor capability to generate dense point clouds, resulting in models with low point density, particularly sparse points in the trunk areas, which affects the accuracy of diameter at breast height (DBH) estimation. Tree height and crown diameter can be extracted from the point clouds reconstructed by all three methods, with NeRF achieving the highest accuracy in tree height; however, DBH extracted from photogrammetric point clouds remains more accurate than that from NeRF point clouds. Meanwhile, compared to ground-level smartphone images, tree parameters extracted from the higher-resolution, multi-perspective drone images are more accurate. These findings confirm that NVS methods have significant application potential for the 3D reconstruction of urban forests. Full article
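Extracting the individual tree parameters from a segmented tree's point cloud can be sketched as follows: tree height as the vertical extent of the points, and DBH from a thin horizontal trunk slice at 1.3 m above the lowest point. This is a simplified illustration under assumed conventions (the paper does not publish this code; a centroid-plus-mean-radius estimate stands in for a proper least-squares circle fit):

```python
import math

def tree_height(points):
    """Tree height as the vertical extent of a segmented tree's point cloud."""
    zs = [p[2] for p in points]
    return max(zs) - min(zs)

def dbh_from_slice(points, breast_height=1.3, slice_half_width=0.05):
    """Rough DBH estimate: take a thin horizontal slice of trunk points
    around breast height (1.3 m above the lowest point), then approximate
    the trunk circle by the mean radial distance from the slice centroid."""
    z0 = min(p[2] for p in points)
    band = [p for p in points if abs(p[2] - z0 - breast_height) <= slice_half_width]
    if not band:
        return None  # no trunk points recovered at breast height
    cx = sum(p[0] for p in band) / len(band)
    cy = sum(p[1] for p in band) / len(band)
    radius = sum(math.hypot(p[0] - cx, p[1] - cy) for p in band) / len(band)
    return 2.0 * radius
```

This sketch also makes the 3DGS limitation above concrete: when the trunk slice contains few or no points, `dbh_from_slice` degrades or returns `None`, whereas `tree_height` only needs the extreme points and so remains robust.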
(This article belongs to the Section AI Remote Sensing)
