MDPI - Publisher of Open Access Journals

19 pages, 3374 KiB

Open Access Article

The Influence of Viewing Geometry on Hyperspectral-Based Soil Property Retrieval

by Yucheng Gao, Lixia Ma, Zhongqi Zhang, Xianzhang Pan, Ziran Yuan, Changkun Wang and Dongsheng Yu

Remote Sens. 2025, 17(14), 2510; https://doi.org/10.3390/rs17142510 - 18 Jul 2025

Viewed by 176

Abstract

Hyperspectral technology has been widely applied to the retrieval of soil properties, such as soil organic matter (SOM) and particle size distribution (PSD). However, most previous studies have focused on hyperspectral data acquired from the nadir direction, and the influence of viewing geometry on hyperspectral-based soil property retrieval remains unclear. In this study, bidirectional reflectance factors (BRFs) were collected at 48 different viewing angles for 154 soil samples with varying SOM contents and PSDs. SOM and PSD were then retrieved using combinations of ten spectral preprocessing methods (raw reflectance, Savitzky–Golay filter (SG), first derivative (D1), second derivative (D2), standard normal variate (SNV), multiplicative scatter correction (MSC), SG + D1, SG + D2, SG + SNV, and SG + MSC), one sensitive wavelength selection method, and three retrieval algorithms (partial least squares regression (PLSR), support vector machine (SVM), and convolutional neural networks (CNNs)). The influence of viewing geometry on the selection of spectral preprocessing methods, retrieval algorithms, sensitive wavelengths, and retrieval accuracy was systematically analyzed. The results showed that soil BRFs are influenced by both soil properties and viewing angles. The viewing geometry had limited effects on the choice of preprocessing method and retrieval algorithm. Among the preprocessing methods, D1, SG + D1, and SG + D2 outperformed the others, while PLSR achieved a higher accuracy than SVM and CNN when retrieving soil properties. The selected sensitive wavelengths for both SOM and PSD varied slightly with viewing angle and were mainly located in the near-infrared region when using BRFs from multiple viewing angles. Compared with single-angle data, multi-angle BRFs significantly improved retrieval performance, with the R² increasing by 11% and 15%, and RMSE decreasing by 16% and 30% for SOM and PSD, respectively. The optimal viewing zenith angle ranged from 10° to 20° for SOM and around 40° for PSD. Additionally, backward viewing directions were more favorable than forward directions, with the optimal viewing azimuth angles being 0° for SOM and 90° for PSD. These findings provide useful insights for improving the accuracy of soil property retrieval using multi-angle hyperspectral observations.
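Two of the preprocessing methods named in the abstract above, SNV and the first derivative (D1), can be illustrated with a minimal sketch. This is not the authors' code: the function names, the use of the sample standard deviation, the plain finite difference (rather than a Savitzky–Golay derivative), and the reflectance values are all assumptions made for illustration.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own mean and std."""
    m = mean(spectrum)
    s = stdev(spectrum)  # sample standard deviation (n - 1); an assumed convention
    return [(v - m) / s for v in spectrum]


def first_derivative(spectrum, step=1.0):
    """Finite-difference first derivative (D1) along the wavelength axis."""
    return [(b - a) / step for a, b in zip(spectrum, spectrum[1:])]


# Made-up reflectance values for five adjacent bands (illustrative only).
reflectance = [0.10, 0.12, 0.15, 0.19, 0.24]
snv_spec = snv(reflectance)
d1_spec = first_derivative(reflectance)
```

SNV removes per-spectrum offset and scale differences, while D1 suppresses baseline shifts; either step would normally be applied to every spectrum before regression.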

(This article belongs to the Section Remote Sensing in Agriculture and Vegetation)

21 pages, 2471 KiB

Open Access Article

Attention-Based Mask R-CNN Enhancement for Infrared Image Target Segmentation

by Liang Wang and Kan Ren

Symmetry 2025, 17(7), 1099; https://doi.org/10.3390/sym17071099 - 9 Jul 2025

Viewed by 343

Abstract

Image segmentation is an important method in the field of image processing, while infrared (IR) image segmentation is one of the challenges in this field due to the unique characteristics of IR data. Infrared imaging utilizes the infrared radiation emitted by objects to produce images, which can supplement the performance of visible-light images under adverse lighting conditions to some extent. However, the low spatial resolution and limited texture details in IR images hinder the achievement of high-precision segmentation. To address these issues, an attention mechanism based on symmetrical cross-channel interaction, motivated by symmetry principles in computer vision, was integrated into a Mask Region-Based Convolutional Neural Network (Mask R-CNN) framework. A Bottleneck-enhanced Squeeze-and-Attention (BNSA) module was incorporated into the backbone network, and novel loss functions were designed for both the bounding box (Bbox) regression and mask prediction branches to enhance segmentation performance. Furthermore, a dedicated infrared image dataset was constructed to validate the proposed method. The experimental results demonstrate that the optimized model achieves higher segmentation accuracy than the original network and other mainstream segmentation models on our dataset, demonstrating how symmetrical design principles can effectively improve complex vision tasks.

(This article belongs to the Special Issue Symmetry and Its Applications in Computer Vision)

23 pages, 37536 KiB

Open Access Article

Underwater Sound Speed Profile Inversion Based on Res-SACNN from Different Spatiotemporal Dimensions

by Jiru Wang, Fangze Xu, Yuyao Liu, Yu Chen and Shu Liu

Remote Sens. 2025, 17(13), 2293; https://doi.org/10.3390/rs17132293 - 4 Jul 2025

Viewed by 270

Abstract

The sound speed profile (SSP) is an important feature in the field of ocean acoustics. The accurate estimation of SSP is significant for the development of underwater positioning, communication, and associated fundamental marine research. The Res-SACNN model is proposed for SSP inversion based on the convolutional neural network (CNN) embedded with the residual network and self-attention mechanism. It combines the spatiotemporal characteristics of sea level anomaly (SLA) and sea surface temperature anomaly (SSTA) data and establishes a nonlinear relationship between satellite remote sensing data and sound speed field by deep learning. The single empirical orthogonal function regression (sEOF-r) method is used in a comparative experiment to confirm the model’s performance in both the time domain and the region. Experimental results demonstrate that the proposed model outperforms sEOF-r regarding both spatiotemporal generalization ability and inversion accuracy. The average root mean square error (RMSE) is decreased by 0.92 m/s in the time-domain experiment in the South China Sea, and the inversion results for each month are more consistent. The optimization ratio reaches 71.8% and the average RMSE decreases by 7.39 m/s in the six-region experiment. The Res-SACNN model not only shows superior inversion ability in the comparison with other deep-learning models, but also achieves strong generalization and real-time performance while maintaining low complexity, providing an improved technical tool for SSP estimation and sound field perception.

30 pages, 6733 KiB

Open Access Article

Forecasting Electric Vehicle Charging Demand in Smart Cities Using Hybrid Deep Learning of Regional Spatial Behaviours

by Muhammed Cavus, Huseyin Ayan, Dilum Dissanayake, Anurag Sharma, Sanchari Deb and Margaret Bell

Energies 2025, 18(13), 3425; https://doi.org/10.3390/en18133425 - 29 Jun 2025

Viewed by 379

Abstract

This study presents a novel predictive framework for estimating electric vehicle (EV) charging demand in smart cities, contributing to the advancement of data-driven infrastructure planning through behavioural and spatial data analysis. Motivated by the accelerating regional demand accompanying EV adoption, this work introduces HCB-Net: a hybrid deep learning model that combines Convolutional Neural Networks (CNNs) for spatial feature extraction with Extreme Gradient Boosting (XGBoost) for robust regression. The framework is trained on user-level survey data from two demographically distinct UK regions, the West Midlands and the North East, incorporating user demographics, commute distance, charging frequency, and home/public charging preferences. HCB-Net achieved superior predictive performance, with a Root Mean Squared Error (RMSE) of 0.1490 and an $R^{2}$ score of 0.3996. Compared to the best-performing traditional model (Linear Regression, $R^{2} = 0.3520$), HCB-Net improved predictive accuracy by 13.5% in terms of $R^{2}$, and outperformed other deep learning models such as LSTM ($R^{2} = -0.3756$) and GRU ($R^{2} = -0.6276$), which failed to capture spatial patterns effectively. The hybrid model also reduced RMSE by approximately 23% compared to the standalone CNN (RMSE = 0.1666). While the moderate $R^{2}$ indicates scope for further refinement, these results demonstrate that meaningful and interpretable demand forecasts can be generated from survey-based behavioural data, even in the absence of high-resolution temporal inputs. The model contributes a lightweight and scalable forecasting tool suitable for early-stage smart city planning in contexts where telemetry data are limited, thereby advancing the practical capabilities of EV infrastructure forecasting.
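The 13.5% figure quoted above reads as a relative gain in R² over the Linear Regression baseline. A minimal sketch of that arithmetic follows; the helper name `relative_change` is hypothetical, and interpreting the figure as a relative (not absolute) change is an assumption based on the two R² values given in the abstract.

```python
def relative_change(new, baseline):
    """Relative change of `new` with respect to `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0


# R^2 values as quoted in the abstract: HCB-Net vs. Linear Regression.
r2_gain = relative_change(0.3996, 0.3520)
print(round(r2_gain, 1))  # 13.5
```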

(This article belongs to the Special Issue Sustainable and Low Carbon Development in the Energy Sector)

22 pages, 47906 KiB

Open Access Article

Spatial Localization of Broadleaf Species in Mixed Forests in Northern Japan Using UAV Multi-Spectral Imagery and Mask R-CNN Model

by Nyo Me Htun, Toshiaki Owari, Satoshi N. Suzuki, Kenji Fukushi, Yuuta Ishizaki, Manato Fushimi, Yamato Unno, Ryota Konda and Satoshi Kita

Remote Sens. 2025, 17(13), 2111; https://doi.org/10.3390/rs17132111 - 20 Jun 2025

Viewed by 646

Abstract

Precise spatial localization of broadleaf species is crucial for efficient forest management and ecological studies. This study presents an advanced approach for segmenting and classifying broadleaf tree species, including Japanese oak (Quercus crispula), in mixed forests using multi-spectral imagery captured by unmanned aerial vehicles (UAVs) and deep learning. High-resolution UAV images, including RGB and NIR bands, were collected from two study sites in Hokkaido, Japan: Sub-compartment 97g in the eastern region and Sub-compartment 68E in the central region. A Mask Region-based Convolutional Neural Network (Mask R-CNN) framework was employed to recognize and classify single tree crowns based on annotated training data. The workflow incorporated UAV-derived imagery and crown annotations, supporting reliable model development and evaluation. Results showed that combining multi-spectral bands (RGB and NIR) with canopy height model (CHM) data significantly improved classification performance at both study sites. In Sub-compartment 97g, the RGB + NIR + CHM combination achieved a precision of 0.76, recall of 0.74, and F1-score of 0.75, compared to 0.73, 0.74, and 0.73 using RGB alone; 0.68, 0.70, and 0.66 with RGB + NIR; and 0.63, 0.67, and 0.63 with RGB + CHM. Similarly, at Sub-compartment 68E, the RGB + NIR + CHM combination attained a precision of 0.81, recall of 0.78, and F1-score of 0.80, outperforming RGB alone (0.79, 0.79, 0.78), RGB + NIR (0.75, 0.74, 0.72), and RGB + CHM (0.76, 0.75, 0.74). These consistent improvements across diverse forest conditions highlight the effectiveness of integrating spectral (RGB and NIR) and structural (CHM) data. The findings underscore the value of integrating UAV multi-spectral imagery with deep learning techniques for reliable, large-scale identification of tree species and forest monitoring.
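For readers checking the reported scores, the F1-score is the harmonic mean of precision and recall. The sketch below applies that standard definition to the Sub-compartment 97g values; the function name is hypothetical, and the abstract does not say whether its scores are averaged per class, so other rows need not reproduce exactly from this formula.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall quoted for RGB + NIR + CHM at Sub-compartment 97g.
f1_97g = f1_score(0.76, 0.74)
print(round(f1_97g, 2))  # 0.75
```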

(This article belongs to the Special Issue Application of Remote Sensing in Forest Ecosystem Functioning and Services)

15 pages, 3095 KiB

Open Access Article

A Deep Learning Method for the Automated Mapping of Archaeological Structures from Geospatial Data: A Case Study of Delos Island

by Pavlos Fylaktos, George P. Petropoulos and Ioannis Lemesios

ISPRS Int. J. Geo-Inf. 2025, 14(6), 220; https://doi.org/10.3390/ijgi14060220 - 2 Jun 2025

Viewed by 598

Abstract

The integration of artificial intelligence (AI), specifically through convolutional neural networks (CNNs), is paving the way for significant advancements in archaeological research. This study explores the innovative application of the so-called Mask Region-based convolutional neural network (Mask R-CNN) algorithm in a GIS environment, utilizing high-resolution satellite imagery from the WorldView-3 system. By combining these state-of-the-art technologies, this study demonstrates the algorithm’s effectiveness at recognizing and segmenting the ancient structures within the archaeological site of Delos, Greece. Despite the computational constraints, the outcomes are promising, with around 25.91% of the initial vector data (434 out of 1675 polygons) successfully identified. The algorithm achieved an impressive F1 Score of 0.93% at a threshold of 0.9, indicating its high precision in differentiating specific features from their environments. This research highlights AI’s crucial role in archaeology, enabling the remote analysis of vast areas through automated or semi-automated techniques. Although these technologies cannot supplant essential on-site investigations, they can significantly enhance traditional methodologies by minimizing costs and fieldwork duration. This study also points out obstacles, such as the complexity of and variability in archaeological remains, which complicate the creation of standardized data libraries. Nevertheless, as AI technologies progress, their applications in archaeology are anticipated to broaden, fostering further innovation within the discipline.

30 pages, 8985 KiB

Open Access Article

Dynamic Cascade Detector for Storage Tanks and Ships in Optical Remote Sensing Images

by Tong Wang, Bingxin Liu and Peng Chen

Remote Sens. 2025, 17(11), 1882; https://doi.org/10.3390/rs17111882 - 28 May 2025

Viewed by 319

Abstract

Regional Convolutional Neural Network (RCNN)-based detectors have played a crucial role in object detection in remote sensing images due to their exceptional detection capabilities. Some studies have shown that different stages should have different Intersection over Union (IoU) thresholds to distinguish positive and negative samples because each stage has a different IoU distribution. However, these studies have overlooked the fact that the IoU distribution at each stage changes continuously during the training process. Therefore, the IoU threshold at each stage should also be adjusted continuously to adapt to the changes in the IoU distribution. We observed that the IoU distribution at each stage is very similar to a skewed Gaussian distribution. In this paper, we introduce a novel dynamic IoU threshold method based on the Cascade RCNN architecture, called the Dynamic Cascade detector, with reference to the skewed Gaussian distribution. We tested the effectiveness of this method by detecting horizontal storage tanks and rotated ships in optical remote sensing images. Our experiments demonstrated that this technique can significantly improve detection results, as evaluated based on the COCO metric. In addition, the threshold range of the last stage impacts other stages, so the threshold range of one stage may change significantly when the number of stages changes. Furthermore, the threshold may not always increase during the training process and may decrease when the IoU distribution resembles a negatively skewed distribution.
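The role of the IoU threshold in separating positive from negative samples can be sketched as follows. This covers only axis-aligned boxes given as `(x1, y1, x2, y2)`; the rotated-ship case would need a rotated IoU. The function names and example boxes are hypothetical, and the threshold is a plain parameter here because the abstract does not give the paper's dynamic update rule.

```python
def iou(a, b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def label_proposals(proposals, ground_truth, threshold):
    """Mark each proposal positive (True) when its IoU reaches the threshold."""
    return [iou(p, ground_truth) >= threshold for p in proposals]


gt_box = (0, 0, 10, 10)
proposals = [(0, 0, 10, 10), (5, 0, 15, 10), (20, 20, 30, 30)]
strict = label_proposals(proposals, gt_box, threshold=0.5)
loose = label_proposals(proposals, gt_box, threshold=0.3)
```

The second proposal has an IoU of 1/3 with the ground truth, so it is a negative sample at a threshold of 0.5 but a positive one at 0.3, which illustrates why the threshold chosen at each stage affects which samples the stage trains on.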

33 pages, 15457 KiB

Open Access Article

A Hybrid Approach for Assessing Aquifer Health Using the SWAT Model, Tree-Based Classification, and Deep Learning Algorithms

by Amit Bera, Litan Dutta, Sanjit Kumar Pal, Rajwardhan Kumar, Pradeep Kumar Shukla, Wafa Saleh Alkhuraiji, Bojan Đurin and Mohamed Zhran

Water 2025, 17(10), 1546; https://doi.org/10.3390/w17101546 - 21 May 2025

Viewed by 1719

Abstract

Aquifer health assessment is essential for sustainable groundwater management, particularly in semi-arid regions with challenging geological conditions. This study presents a novel methodology for assessing aquifer health in the Barakar River Basin, a hard-rock terrain, by integrating tree-based classification, deep learning, and the Soil and Water Assessment Tool (SWAT) model. Employing Random Forest, Decision Tree, and Convolutional Neural Network (CNN) models, the research examines 20 influential factors, including hydrological, water quality, and socioeconomic variables, to classify aquifer health into four categories: Good, Moderately Good, Semi-Critical, and Critical. The CNN model exhibited the highest predictive accuracy, identifying 33% of the basin as having Good aquifer health, while Random Forest assessed 27% as having Critical health. Pearson correlation analysis of CNN-predicted aquifer health indicates that groundwater recharge (r = 0.52), return flow (r = 0.50), and groundwater fluctuation (r = 0.48) are the most influential positive factors. Validation results showed that the CNN model performed strongly, with a precision of 0.957, Area Under the Curve–Receiver Operating Characteristic (AUC-ROC) of 0.95, and F1 score of 0.828, underscoring its reliability and robustness. Geophysical Electrical Resistivity Tomography (ERT) field surveys validated these classifications, particularly in high- and low-aquifer health zones. This study enhances understanding of aquifer dynamics and presents a robust methodology with broader applicability for sustainable groundwater management worldwide.

(This article belongs to the Section Water Quality and Contamination)

23 pages, 2563 KiB

Open Access Article

LiDAR Sensor Parameter Augmentation and Data-Driven Influence Analysis on Deep-Learning-Based People Detection

by Lukas Haas, Florian Sanne, Johann Zedelmeier, Subir Das, Thomas Zeh, Matthias Kuba, Florian Bindges, Martin Jakobi and Alexander W. Koch

Sensors 2025, 25(10), 3114; https://doi.org/10.3390/s25103114 - 14 May 2025

Viewed by 636

Abstract

Light detection and ranging (LiDAR) sensor technology for people detection offers a significant advantage in data protection. However, to design these systems cost- and energy-efficiently, the relationship between the measurement data and final object detection output with deep neural networks (DNNs) has to be elaborated. Therefore, this paper presents augmentation methods to analyze the influence of the distance, resolution, noise, and shading parameters of a LiDAR sensor in real point clouds for people detection. Furthermore, their influence on object detection using DNNs was investigated. A significant reduction in the quality requirements for the point clouds was possible for the measurement setup with only minor degradation on the object list level. The DNNs PointVoxel-Region-based Convolutional Neural Network (PV-RCNN) and Sparsely Embedded Convolutional Detection (SECOND) both show a reduction in object detection of less than 5% for a resolution reduced by a factor of up to 32, for a distance increased by a factor of 4, and for Gaussian noise of up to $\mu = 0$ and $\sigma = 0.07$. In addition, both networks require an unshaded height of approx. 0.5 m from a detected person’s head downwards to ensure good people detection performance without special training for these cases. The results obtained, such as shadowing information, are transferred to a software program to determine the minimum number of sensors and their orientation based on the mounting height of the sensor, the sensor parameters, and the ground area under consideration, both for detection at the point cloud level and object detection level.
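As a rough illustration of two of the augmentations mentioned above (resolution reduction and additive Gaussian noise), the sketch below operates on a list of `(x, y, z)` points. It is not the authors' method: keeping every n-th point is a simplification of a sensor's angular resolution, applying the noise independently to each coordinate is an assumption, the unit of σ is not stated in the abstract, and all names are hypothetical.

```python
import random


def reduce_resolution(points, factor):
    """Keep every `factor`-th point as a crude stand-in for a coarser scan."""
    return points[::factor]


def add_gaussian_noise(points, mu=0.0, sigma=0.07, seed=0):
    """Add independent Gaussian noise to each coordinate of each point."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [tuple(c + rng.gauss(mu, sigma) for c in p) for p in points]


# Synthetic point cloud: 64 points along the x axis (illustrative only).
cloud = [(float(i), 0.0, 1.0) for i in range(64)]
sparse = reduce_resolution(cloud, 32)
noisy = add_gaussian_noise(cloud)
```

Running a detector on `sparse` or `noisy` versions of real clouds and comparing the detections against those for the original cloud is one way to measure the kind of degradation the abstract reports.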

(This article belongs to the Section Optical Sensors)

15 pages, 10355 KiB

Open Access Article

Automated Detection and Counting of Gossypium barbadense Fruits in Peruvian Crops Using Convolutional Neural Networks

by Juan Ballena-Ruiz, Juan Arcila-Diaz and Victor Tuesta-Monteza

AgriEngineering 2025, 7(5), 152; https://doi.org/10.3390/agriengineering7050152 - 12 May 2025

Cited by 1 | Viewed by 659

Abstract

This study presents the development of a system based on convolutional neural networks for the automated detection and counting of Gossypium barbadense fruits, specifically the IPA cotton variety, during its maturation stage, known as “mota”, in crops located in the Lambayeque region of northern Peru. To achieve this, a dataset was created using images captured with a mobile device. After applying data augmentation techniques, the dataset consisted of 2186 images with 70,348 labeled fruits. Five deep learning models were trained: two variants of YOLO version 8 (nano and extra-large), two of YOLO version 11, and one based on the Faster R-CNN architecture. The dataset was split into 70% for training, 15% for validation, and 15% for testing, and all models were trained over 100 epochs with a batch size of 8. The extra-large YOLO models achieved the highest performance, with precision scores of 99.81% and 99.78%, respectively, and strong recall and F1-score values. In contrast, the nano models and Faster R-CNN showed slightly lower effectiveness. Additionally, the best-performing model was integrated into a web application developed in Python, enabling automated fruit counting from field images. The YOLO architecture emerged as an efficient and robust alternative for the automated detection of cotton fruits and stood out for its capability to process images in real time with high precision. Furthermore, its implementation in crop monitoring facilitates production estimation and decision-making in precision agriculture.
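The 70/15/15 split described above can be sketched as a shuffled partition. The function name, the seed, and the choice to truncate the training and validation sizes (leaving the remainder for testing) are assumptions; the abstract does not give the exact image counts per subset.

```python
import random


def split_dataset(items, train=0.70, val=0.15, seed=42):
    """Shuffle items and split into train/validation/test; the test set takes the remainder."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = int(len(items) * train)
    n_val = int(len(items) * val)
    return (
        items[:n_train],
        items[n_train:n_train + n_val],
        items[n_train + n_val:],
    )


# 2186 placeholder image IDs, matching the dataset size quoted in the abstract.
image_ids = range(2186)
train_set, val_set, test_set = split_dataset(image_ids)
```

With these rounding choices the subsets hold 1530, 327, and 329 images; a different rounding rule would shift a few images between subsets.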

(This article belongs to the Section Computer Applications and Artificial Intelligence in Agriculture)

23 pages, 9052 KiB

Open Access Article

Intelligent Recognition Method for Ferrography Wear Debris Images Using Improved Mask R-CNN Methods

by Xiangwen Xiao, Weixuan Zhang, Qing Wang, Yuan Liu and Yishou Wang

Lubricants 2025, 13(5), 208; https://doi.org/10.3390/lubricants13050208 - 9 May 2025

Viewed by 561

Abstract

The accurate characterization of wear debris is crucial for assessing the health of rotating engine components and for conducting simulation experiments in debris detection. This study proposed an intelligent recognition method for ferrography wear debris images, leveraging several improved Mask Region-based Convolutional Neural Network (Mask R-CNN) algorithms to quantitatively calculate both the number of debris particles and their coverage areas. The improvements to Mask R-CNN focus on two key aspects: enhancing feature extraction through the feature pyramid network structure and integrating attention mechanisms. The most suitable attention mechanism for wear debris detection was determined through ablation experiments. The improved Mask R-CNN combined with the Convolutional Block Attention Module achieves the best Mean Pixel Accuracy of 87.63% at a processing speed of 7.6 frames per second, demonstrating its high accuracy and efficiency in wear particle segmentation. Furthermore, the quantitative and qualitative analysis of wear debris, including the number and area of debris particles and their classification, provides valuable insights into the severity of wear. These insights are essential for understanding the extent of wear damage and guiding maintenance decisions.

23 pages, 25076 KiB

Open Access Article

Integrating DEM and Deep Learning for Forested Terrain Analysis: Enhancing Fire Risk Assessment Through Mountain Peak and Water System Extraction in Chongli District

by Yihui Wu, Xueying Sun, Liang Qi, Jiang Xu, Demin Gao and Zhengli Zhu

Forests 2025, 16(4), 692; https://doi.org/10.3390/f16040692 - 16 Apr 2025

Viewed by 600

Abstract

Accurate fire risk assessment in forested terrain is crucial for effective disaster management and ecological conservation. This study proposes a novel framework that integrates Digital Elevation Models (DEMs) with deep learning techniques to enhance fire risk assessment in Chongli District. Our framework combines DEM data with Faster Regions with Convolutional Neural Networks (Faster R-CNN) and CNN-based methods, overcoming the limitations of traditional approaches that rely on manual feature extraction. It is capable of automatically identifying critical terrain features, such as mountain peaks and water systems, with higher accuracy and efficiency. DEMs provide high-resolution topographical information, which deep learning models leverage to accurately identify and delineate key geographical features. Our results show that the integration of DEMs and deep learning significantly improves the accuracy of fire risk assessment by offering detailed and precise terrain analysis, thereby providing more reliable inputs for fire behavior prediction. The extracted mountain peaks and water systems, as fundamental inputs for fire behavior prediction, enable more accurate predictions of fire spread and potential impact areas. This study not only highlights the great potential of combining geospatial data with advanced machine learning techniques but also offers a scalable and efficient solution for forest fire risk management in mountainous regions. Future work will focus on expanding the dataset to include more environmental variables and validating the model in different geographical areas to further enhance its robustness and applicability.

(This article belongs to the Special Issue Fire Ecology and Management in Forest—2nd Edition)

27 pages, 5073 KiB

Open Access Review

A Comprehensive Review of Deep Learning in Computer Vision for Monitoring Apple Tree Growth and Fruit Production

by Meng Lv, Yi-Xiao Xu, Yu-Hang Miao and Wen-Hao Su

Sensors 2025, 25(8), 2433; https://doi.org/10.3390/s25082433 - 12 Apr 2025

Viewed by 1526

Abstract

The high nutritional and medicinal value of apples has contributed to their widespread cultivation worldwide. Unfavorable factors in the healthy growth of trees and extensive orchard work are threatening the profitability of apples. This study reviewed deep learning combined with computer vision for monitoring apple tree growth and fruit production processes in the past seven years. Three types of deep learning models were used for real-time target recognition tasks: detection models including You Only Look Once (YOLO) and faster region-based convolutional network (Faster R-CNN); classification models including Alex network (AlexNet) and residual network (ResNet); and segmentation models including segmentation network (SegNet) and mask regional convolutional neural network (Mask R-CNN). These models have been successfully applied to detect pests and diseases (located on leaves, fruits, and trunks), organ growth (including fruits, apple blossoms, and branches), yield, and post-harvest fruit defects. This study introduced deep learning and computer vision methods and outlined the current research on these methods for apple tree growth and fruit production. The advantages and disadvantages of deep learning were discussed, and the difficulties faced and future trends were summarized. It is believed that this research is important for the construction of smart apple orchards.

(This article belongs to the Section Smart Agriculture)

25 pages, 10060 KiB

Open Access Article

Automated Defect Identification System in Printed Circuit Boards Using Region-Based Convolutional Neural Networks

by Kavindu Denuwan Weerakkody, Rebecca Balasundaram, Efosa Osagie and Jabir Alshehabi Al-Ani

Electronics 2025, 14(8), 1542; https://doi.org/10.3390/electronics14081542 - 10 Apr 2025

Viewed by 1118

Abstract

Printed Circuit Board (PCB) manufacturing demands accurate defect detection to ensure quality. Traditional methods, such as manual inspection or basic automated object inspection systems, are often time-consuming and inefficient. This work presents a deep learning architecture using Faster R-CNN with a ResNet-50 backbone to automatically detect and classify PCB defects, including Missing Holes (MHs), Open Circuits (OCs), Mouse Bites (MBs), Shorts, Spurs, and Spurious Copper (SC). The designed architecture involves data acquisition, annotation, and augmentation to enhance model robustness. In this study, the CNN-ResNet-50 backbone achieved a precision–recall value of 87%, denoting strong and well-balanced performance in PCB fault detection and classification. The model effectively identified defective instances, reducing false negatives, which is critical for ensuring quality assurance in PCB manufacturing. Performance evaluation metrics indicated a mean average precision (mAP) of 88% and an Intersection over Union (IoU) score of 72%, signifying high prediction accuracy across various defect classes. The developed model enhances efficiency and accuracy in quality control processes, making it a promising solution for automated PCB inspection.
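
For readers unfamiliar with the Intersection over Union (IoU) score mentioned in the abstract, the following is a minimal sketch of how IoU is commonly computed for two axis-aligned bounding boxes. The function name `iou` and the `(x1, y1, x2, y2)` corner format are assumptions chosen for illustration; the abstract does not say how the authors represented boxes or aggregated the score.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes.

    Each box is (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
    This corner format is an illustrative assumption.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Union = sum of both areas minus the overlap counted twice.
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    predicted = (0, 0, 2, 2)      # hypothetical predicted defect box
    ground_truth = (1, 1, 3, 3)   # hypothetical annotated defect box
    print(iou(predicted, ground_truth))
```

A detection is typically counted as correct when its IoU with an annotated box exceeds a chosen threshold; the threshold used in the study is not stated in the abstract.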

19 pages, 5298 KiB

Open Access Article

A Health Status Identification Method for Rotating Machinery Based on Multimodal Joint Representation Learning and a Residual Neural Network

by Xiangang Cao and Kexin Shi

Appl. Sci. 2025, 15(7), 4049; https://doi.org/10.3390/app15074049 - 7 Apr 2025

Viewed by 457

Abstract

Given that rotating machinery is one of the most commonly used types of mechanical equipment in industrial applications, the identification of its health status is crucial for the safe operation of the entire system. Traditional equipment health status identification mainly relies on conventional single-modal data, such as vibration or acoustic modalities, which often have limitations and false alarm issues when dealing with real-world operating conditions and complex environments. However, with the increasing automation of coal mining equipment, the monitoring of multimodal data related to equipment operation has become more prevalent. Existing multimodal health status identification methods are still imperfect in extracting features, with poor complementarity and consistency among modalities. To address these issues, this paper proposes a multimodal joint representation learning and residual neural network-based method for rotating machinery health status identification. First, vibration, acoustic, and image modal information is comprehensively utilized, which is extracted using a Gramian Angular Field (GAF), Mel-Frequency Cepstral Coefficients (MFCCs), and a Faster Region-based Convolutional Neural Network (Faster R-CNN), respectively, to construct a feature set. Second, an orthogonal projection combined with a Transformer is used to enhance the target modality, while a modality attention mechanism is introduced to take into consideration the interaction between different modalities, enabling multimodal fusion. Finally, the fused features are input into a residual neural network (ResNet) for health status identification. Experiments conducted on a gearbox test platform validate the proposed method, and the results demonstrate that it significantly improves the accuracy and reliability of rotating machinery health status identification.
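
To illustrate the Gramian Angular Field step named in the abstract, here is a minimal sketch that turns a 1-D signal into a 2-D matrix. It assumes the summation variant (each entry is the cosine of the sum of two polar angles) and min-max scaling to [-1, 1]; the abstract does not state which variant or scaling the authors used, and the function name `gramian_angular_field` is hypothetical.

```python
import math


def gramian_angular_field(signal):
    """Gramian Angular Summation Field of a 1-D sequence of numbers.

    Assumption: summation variant with min-max scaling to [-1, 1].
    """
    lo, hi = min(signal), max(signal)
    if hi == lo:
        raise ValueError("signal must contain at least two distinct values")

    # Rescale to [-1, 1] so that arccos is defined for every sample.
    scaled = [(2.0 * v - hi - lo) / (hi - lo) for v in signal]
    # Clamp to guard against tiny floating-point overshoot.
    scaled = [max(-1.0, min(1.0, v)) for v in scaled]

    # Encode each sample as a polar angle.
    phi = [math.acos(v) for v in scaled]

    # Entry (i, j) is cos(phi_i + phi_j), giving a symmetric image-like matrix.
    return [[math.cos(a + b) for b in phi] for a in phi]


if __name__ == "__main__":
    vibration = [0.0, 1.0, 2.0]  # hypothetical short vibration segment
    for row in gramian_angular_field(vibration):
        print([round(x, 3) for x in row])
```

The resulting matrix can be treated as a single-channel image, which is one reason this encoding pairs naturally with convolutional feature extractors.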
