Search Results (210)

Search Parameters:
Keywords = faster region-based convolutional neural network (Faster R-CNN)

22 pages, 3280 KB  
Article
A Novel Scenario-Based Comparative Framework for Short- and Medium-Term Solar PV Power Forecasting Using Deep Learning Models
by Elif Yönt Aydın, Kevser Önal, Cem Haydaroğlu, Heybet Kılıç, Özal Yıldırım, Oğuzhan Katar and Hüseyin Erdoğan
Appl. Sci. 2025, 15(24), 12965; https://doi.org/10.3390/app152412965 - 9 Dec 2025
Viewed by 364
Abstract
Accurate short- and medium-term forecasting of photovoltaic (PV) power generation is vital for grid stability and renewable energy integration. This study presents a comparative scenario-based approach using Long Short-Term Memory (LSTM), Convolutional Neural Network (CNN), and Gated Recurrent Unit (GRU) models trained with one year of real-time meteorological and production data from a 250 kWp grid-connected PV system located at Dicle University in Diyarbakır, Southeastern Anatolia, Turkey. The dataset includes hourly measurements of solar irradiance (average annual GHI 5.4 kWh/m2/day), ambient temperature, humidity, and wind speed, with missing data below 2% after preprocessing. Six forecasting scenarios were designed for different horizons (6 h to 1 month). Results indicate that the LSTM model achieved the best performance in short-term scenarios, reaching R2 values above 0.90 and lower MAE and RMSE compared to CNN and GRU. The GRU model showed similar accuracy with faster training time, while CNN produced higher errors due to the dominant temporal nature of PV output. These results align with recent studies that emphasize selecting suitable deep learning architectures for time-series energy forecasting. This work highlights the benefit of integrating real local meteorological data with deep learning models in a scenario-based design and provides practical insights for regional grid operators and energy planners to reduce production uncertainty. Future studies can improve forecast reliability by testing hybrid models and implementing real-time adaptive training strategies to better handle extreme weather fluctuations. Full article
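The comparison above rests on three standard error metrics (MAE, RMSE, R²). As an aside for readers reproducing such comparisons, a minimal self-contained sketch with toy hourly PV values (illustrative numbers, not the study's data):

```python
import math

def mae(y_true, y_pred):
    # Mean absolute error
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    # Root mean squared error
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r2(y_true, y_pred):
    # Coefficient of determination: 1 - SS_res / SS_tot
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Toy hourly PV output (kW) vs. a hypothetical forecast
actual   = [0.0, 12.5, 48.0, 95.0, 110.0, 90.0, 40.0, 5.0]
forecast = [0.0, 10.0, 50.0, 92.0, 115.0, 88.0, 42.0, 4.0]
print(mae(actual, forecast), rmse(actual, forecast), round(r2(actual, forecast), 4))
```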

20 pages, 8313 KB  
Article
Pipe Burst Detection and Localization in Water Distribution Networks Using Faster Region-Based Convolutional Neural Network
by Kyoungwon Min, Joong Hoon Kim, Donghwi Jung, Seungyub Lee and Doosun Kang
Water 2025, 17(23), 3380; https://doi.org/10.3390/w17233380 - 26 Nov 2025
Viewed by 504
Abstract
Pipe leakage and bursts are the primary contributors to water losses in water distribution networks (WDNs). However, the use of object detection techniques for identifying such failures is underexplored. This study proposes a novel deep-learning-based framework for pipe burst detection and localization (PBD&L) within WDNs. The framework employs spatial encoding of pressure fields obtained from hydraulic simulations of normal and burst scenarios. These encoded images serve as inputs to a faster region-based convolutional neural network (Faster R-CNN) object detection model, specifically designed for infrastructure monitoring. The framework was tested on three WDNs—Fossolo, PB23, and CM53—under varying sensor coverages (100%, 75%, and 50%). The results indicate that the model consistently achieves high detection accuracy across different network configurations, even with limited sensor availability. For Fossolo and PB23, the model demonstrated stable performance; however, for the CM53 network, accuracy decreased at full sensor coverage, possibly owing to overfitting or signal redundancy. Overall, the proposed method presents a robust solution for PBD&L in WDNs, showcasing significant practical applicability. Its ability to maintain high performance under partial observability and diverse network conditions demonstrates its potential for integration into real-time smart water management systems, enabling automated monitoring, rapid response, and improved operational efficiency. Full article
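The spatial-encoding step described above can be illustrated with a toy rasterizer that maps sparse sensor pressures onto a grayscale grid; the grid size, coordinate convention, and scaling here are assumptions for illustration, not the paper's exact encoding:

```python
def encode_pressure_field(sensors, width=8, height=8):
    """Rasterize sparse sensor readings into a grayscale grid.

    sensors: list of (x, y, pressure) with x, y in [0, 1).
    Returns a height x width grid of 0-255 ints (0 where no sensor reports).
    Layout and scaling are illustrative assumptions.
    """
    p_vals = [p for _, _, p in sensors]
    p_min, p_max = min(p_vals), max(p_vals)
    span = (p_max - p_min) or 1.0
    grid = [[0] * width for _ in range(height)]
    for x, y, p in sensors:
        col = min(int(x * width), width - 1)
        row = min(int(y * height), height - 1)
        grid[row][col] = int(255 * (p - p_min) / span)
    return grid

# Three hypothetical pressure sensors (normalized coordinates, pressure in m)
readings = [(0.1, 0.2, 52.0), (0.5, 0.5, 31.0), (0.9, 0.8, 47.5)]
img = encode_pressure_field(readings)
```

A burst depresses pressures locally, so the darkened region in such an image is what an object detector can learn to localize.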

23 pages, 8309 KB  
Article
Hybrid Faster R-CNN for Tooth Numbering in Periapical Radiographs Based on Fédération Dentaire Internationale System
by Yong-Shao Su, I Elizabeth Cha, Yi-Cheng Mao, Li-Hsin Chang, Zi-Chun Kao, Shun-Yuan Tien, Yuan-Jin Lin, Shih-Lun Chen, Kuo-Chen Li and Patricia Angela R. Abu
Diagnostics 2025, 15(22), 2900; https://doi.org/10.3390/diagnostics15222900 - 15 Nov 2025
Viewed by 603
Abstract
Background/Objectives: Tooth numbering is essential because it allows dental clinicians to identify lesion locations during diagnosis, typically using the Fédération Dentaire Internationale system. However, accurate tooth numbering is challenging due to variations in periapical radiograph (PA) angles. In this study, we aimed to develop a deep learning-based tool to assist dentists in accurately identifying teeth via tooth numbering and improve diagnostic efficiency and accuracy. Methods: We developed a Hybrid Faster Region-based Convolutional Neural Network (R-CNN) technique and a custom loss function tailored for PA tooth numbering to accelerate training. Additionally, we developed a tooth-numbering position auxiliary localization algorithm to address challenges associated with missing teeth and extensive crown loss in existing datasets. Results: We achieved a maximum precision of 95.16% utilizing the transformer-based NextViT-Faster R-CNN hybrid model, along with an accuracy increase of at least 8.5% and a 19.8% reduction in training time compared to models without the proposed tooth-numbering position auxiliary localization algorithm and conventional methods. Conclusions: The results demonstrate the effectiveness of the proposed method in overcoming challenges in PA tooth numbering within AI-assisted dental diagnostics, enhancing clinical efficiency, and reducing the risk of misdiagnosis in dental practices. Full article
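For readers unfamiliar with the Fédération Dentaire Internationale system mentioned above: a two-digit code combines a quadrant digit (1 to 4 for permanent teeth) with a tooth-position digit (1 to 8 from the midline). A small decoder sketch:

```python
QUADRANTS = {1: "upper right", 2: "upper left", 3: "lower left", 4: "lower right"}
TOOTH_TYPES = {1: "central incisor", 2: "lateral incisor", 3: "canine",
               4: "first premolar", 5: "second premolar", 6: "first molar",
               7: "second molar", 8: "third molar"}

def decode_fdi(code):
    # Decode a two-digit FDI (ISO 3950) permanent-tooth code,
    # e.g. 36 -> lower left first molar.
    quadrant, position = divmod(code, 10)
    if quadrant not in QUADRANTS or position not in TOOTH_TYPES:
        raise ValueError(f"not a permanent-tooth FDI code: {code}")
    return f"{QUADRANTS[quadrant]} {TOOTH_TYPES[position]}"

print(decode_fdi(36))  # lower left first molar
```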
(This article belongs to the Special Issue 3rd Edition: AI/ML-Based Medical Image Processing and Analysis)

28 pages, 12549 KB  
Article
An Enhanced Faster R-CNN for High-Throughput Winter Wheat Spike Monitoring to Improve Yield Prediction and Water Use Efficiency
by Donglin Wang, Longfei Shi, Yanbin Li, Binbin Zhang, Guangguang Yang and Serestina Viriri
Agronomy 2025, 15(10), 2388; https://doi.org/10.3390/agronomy15102388 - 14 Oct 2025
Viewed by 599
Abstract
This study develops an innovative unmanned aerial vehicle (UAV)-based intelligent system for winter wheat yield prediction, addressing the inefficiencies of traditional manual counting methods (with approximately 15% error rate) and enabling quantitative analysis of water–fertilizer interactions. By integrating an enhanced Faster Region-Based Convolutional Neural Network (Faster R-CNN) architecture with multi-source data fusion and machine learning, the system significantly improves both spike detection accuracy and yield forecasting performance. Field experiments during the 2022–2023 growing season captured high-resolution multispectral imagery for varied irrigation regimes and fertilization treatments. The optimized detection model incorporates ResNet-50 as the backbone feature extraction network, with residual connections and channel attention mechanisms, achieving a mean average precision (mAP) of 91.2% (calculated at IoU threshold 0.5) and 88.72% recall while reducing computational complexity. The model outperformed YOLOv8 by a statistically significant 2.1% margin (p < 0.05). Using model-generated spike counts as input, the random forest (RF) model regressor demonstrated superior yield prediction performance (R2 = 0.82, RMSE = 324.42 kg·ha−1), exceeding the Partial Least Squares Regression (PLSR) (R2 +46%, RMSE-44.3%), Least Squares Support Vector Machine (LSSVM) (R2 + 32.3%, RMSE-32.4%), Support Vector Regression (SVR) (R2 + 30.2%, RMSE-29.6%), and Backpropagation (BP) Neural Network (R2+22.4%, RMSE-24.4%) models. Analysis of different water–fertilizer treatments revealed that while organic fertilizer under full irrigation (750 m3 ha−1) conditions achieved maximum yield benefit (13,679.26 CNY·ha−1), it showed relatively low water productivity (WP = 7.43 kg·m−3). 
Conversely, under deficit irrigation (450 m3 ha−1) conditions, the 3:7 organic/inorganic fertilizer treatment achieved optimal WP (11.65 kg m−3) and WUE (20.16 kg∙ha−1∙mm−1) while increasing yield benefit by 25.46% compared to organic fertilizer alone. This research establishes an integrated technical framework for high-throughput spike monitoring and yield estimation, providing actionable insights for synergistic water–fertilizer management strategies in sustainable precision agriculture. Full article
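The water-productivity figures quoted above follow from yield per unit of water. A hedged sketch with hypothetical numbers (note the paper's WUE is likely computed against total evapotranspiration, not irrigation volume alone):

```python
def water_productivity(yield_kg_per_ha, water_m3_per_ha):
    # WP (kg per m^3): crop yield per unit volume of water
    return yield_kg_per_ha / water_m3_per_ha

def water_use_efficiency(yield_kg_per_ha, water_mm):
    # WUE (kg per ha per mm): yield per millimetre of water depth
    return yield_kg_per_ha / water_mm

# Hypothetical plot: 5600 kg/ha under 450 m3/ha of water.
# 1 mm of depth over 1 ha equals 10 m3, so 450 m3/ha = 45 mm.
wp = water_productivity(5600, 450)
wue = water_use_efficiency(5600, 450 / 10)
```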
(This article belongs to the Section Water Use and Irrigation)

14 pages, 1787 KB  
Article
HE-DMDeception: Adversarial Attack Network for 3D Object Detection Based on Human Eye and Deep Learning Model Deception
by Pin Zhang, Yawen Liu, Heng Liu, Yichao Teng, Jiazheng Ni, Zhuansun Xiaobo and Jiajia Wang
Information 2025, 16(10), 867; https://doi.org/10.3390/info16100867 - 7 Oct 2025
Viewed by 615
Abstract
This paper presents HE-DMDeception, a novel adversarial attack network that integrates human visual deception with deep model deception to enhance the security of 3D object detection. Existing patch-based and camouflage methods can mislead deep learning models but struggle to generate visually imperceptible, high-quality textures. Our framework employs a CycleGAN-based camouflage network to generate highly camouflaged background textures, while a dedicated deception module disrupts non-maximum suppression (NMS) and attention mechanisms through optimized constraints that balance attack efficacy and visual fidelity. To overcome the scarcity of annotated vehicle data, an image segmentation module based on the pre-trained Segment Anything (SAM) model is introduced, leveraging a two-stage training strategy combining semi-supervised self-training and supervised fine-tuning. Experimental results show that the minimum P@0.5 values (50%, 55%, 20%, 25%, 25%) were achieved by HE-DMDeception across You Only Look Once version 8 (YOLOv8), Real-Time Detection Transformer (RT-DETR), Faster Region-based Convolutional Neural Network (Faster R-CNN), Single Shot MultiBox Detector (SSD), and Mask Region-based Convolutional Neural Network (Mask R-CNN) detection models, while maintaining high visual consistency with the original camouflage. These findings demonstrate the robustness and practicality of HE-DMDeception, offering new insights into 3D object detection adversarial attacks. Full article
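The NMS stage that the deception module targets is, in its standard greedy form, straightforward; a minimal sketch (not the paper's attack, just the mechanism being disrupted):

```python
def iou(a, b):
    # Intersection-over-union of two boxes (x1, y1, x2, y2)
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, thresh=0.5):
    # Greedy non-maximum suppression: keep highest-scoring boxes,
    # drop any box overlapping an already-kept box above `thresh`.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= thresh for j in keep):
            keep.append(i)
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # the two overlapping boxes collapse to one detection
```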

18 pages, 5522 KB  
Article
Automated Detection of Methane Leaks by Combining Infrared Imaging and a Gas-Faster Region-Based Convolutional Neural Network Technique
by Jinhui Zuo, Zhengqiang Li, Wenbin Xu, Jinxin Zuo and Zhipeng Rong
Sensors 2025, 25(18), 5714; https://doi.org/10.3390/s25185714 - 12 Sep 2025
Viewed by 1440
Abstract
Gas leaks threaten ecological and social safety. Non-contact infrared imaging enables large-scale, real-time measurements; however, in complex environments, weak signals from small leaks can hinder reliable detection. This study proposes a novel automated methane leak detection method based on infrared imaging and a Gas-Faster Region-based convolutional neural network (Gas R-CNN) to classify leakage amounts (≥30 mL/min). An uncooled infrared imaging system was employed to capture gas leak images containing leak volume features. We developed the Gas R-CNN model for gas leakage detection. This model introduces a multiscale feature network to improve leak feature extraction and enhancement, and it incorporates region-of-interest alignment to address the mismatch caused by double quantization. Feature extraction was enhanced by integrating ResNet50 with an efficient channel attention mechanism. Image enhancement techniques were applied to expand the dataset diversity. Leak detection capabilities were validated using the IOD-Video dataset, while the constructed gas dataset enabled the first quantitative leak assessment. The experimental results demonstrated that the model can accurately detect the leakage area and classify leakage amounts, enabling the quantitative analysis of infrared images. The proposed method achieved average precisions of 0.9599, 0.9647, and 0.9833 for leak rates of 30, 100, and 300 mL/min, respectively. Full article
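Average precision, reported per leak rate above, is the area under an interpolated precision-recall curve; a minimal sketch with hypothetical PR points (the interpolation convention matches common detection benchmarks, not necessarily the paper's exact protocol):

```python
def average_precision(recalls, precisions):
    """Area under an interpolated precision-recall curve.

    recalls must be sorted ascending; precision is interpolated to be
    monotonically non-increasing from right to left.
    """
    # Append sentinels at recall 0 and 1
    r = [0.0] + list(recalls) + [1.0]
    p = [0.0] + list(precisions) + [0.0]
    # Enforce non-increasing precision from right to left
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    # Sum rectangle areas where recall increases
    return sum((r[i + 1] - r[i]) * p[i + 1] for i in range(len(r) - 1))

ap = average_precision([0.2, 0.5, 0.9], [1.0, 0.8, 0.6])
```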
(This article belongs to the Section Optical Sensors)

27 pages, 13447 KB  
Article
Advancing Intelligent Logistics: YOLO-Based Object Detection with Modified Loss Functions for X-Ray Cargo Screening
by Jun Hao Tee, Mahmud Iwan Solihin, Kim Soon Chong, Sew Sun Tiang, Weng Yan Tham, Chun Kit Ang, Y. J. Lee, C. L. Goh and Wei Hong Lim
Future Transp. 2025, 5(3), 120; https://doi.org/10.3390/futuretransp5030120 - 8 Sep 2025
Cited by 2 | Viewed by 2787
Abstract
Efficient threat detection in X-ray cargo inspection is critical for the security of the global supply chain. This study evaluates YOLO-based object-detection models from YOLOv5 to the latest, YOLOv11, which is enhanced with modified loss functions and Soft-NMS to improve accuracy. The YOLO model comparison also includes DETR (Detection Transformer) and Faster R-CNN (Region-based Convolutional Neural Network). Standard loss functions struggle with overlapping items, low contrast, and small objects in X-ray imagery. To overcome these weaknesses, IoU-based loss functions—CIoU, DIoU, GIoU, and WIoU—are integrated into the YOLO frameworks. Experiments on a dedicated cargo X-ray dataset assess precision, recall, F1-score, mAP@50, mAP@50–95, GFLOPs, and inference speed. The enhanced model, YOLOv11 with WIoU and Soft-NMS, achieves superior localization, reaching 98.44% mAP@50. This work highlights effective enhancements for YOLO models to support intelligent logistics in transportation services and automated threat detection in cargo security systems. Full article
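Of the IoU-based losses listed, GIoU illustrates the shared idea: IoU is extended with an enclosing-box penalty so that even disjoint boxes receive a gradient signal. A sketch (the regression loss would be 1 - GIoU):

```python
def giou(a, b):
    # Generalized IoU (Rezatofighi et al.): IoU minus the fraction of the
    # smallest enclosing box not covered by the union. Boxes: (x1, y1, x2, y2).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    # Smallest axis-aligned box enclosing both inputs
    cx1, cy1 = min(a[0], b[0]), min(a[1], b[1])
    cx2, cy2 = max(a[2], b[2]), max(a[3], b[3])
    c_area = (cx2 - cx1) * (cy2 - cy1)
    return inter / union - (c_area - union) / c_area

# Identical boxes give GIoU = 1; disjoint boxes give GIoU < 0, unlike plain IoU,
# which is 0 for any non-overlapping pair regardless of distance.
print(giou((0, 0, 2, 2), (0, 0, 2, 2)), giou((0, 0, 1, 1), (2, 2, 3, 3)))
```

CIoU, DIoU, and WIoU refine this further with center-distance, aspect-ratio, or focusing terms.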

21 pages, 9664 KB  
Article
A Detection Approach for Wheat Spike Recognition and Counting Based on UAV Images and Improved Faster R-CNN
by Donglin Wang, Longfei Shi, Huiqing Yin, Yuhan Cheng, Shaobo Liu, Siyu Wu, Guangguang Yang, Qinge Dong, Jiankun Ge and Yanbin Li
Plants 2025, 14(16), 2475; https://doi.org/10.3390/plants14162475 - 9 Aug 2025
Cited by 1 | Viewed by 957
Abstract
This study presents an innovative unmanned aerial vehicle (UAV)-based intelligent detection method utilizing an improved Faster Region-based Convolutional Neural Network (Faster R-CNN) architecture to address the inefficiency and inaccuracy inherent in manual wheat spike counting. We systematically collected a high-resolution image dataset (2000 images, 4096 × 3072 pixels) covering key growth stages (heading, grain filling, and maturity) of winter wheat (Triticum aestivum L.) during 2022–2023 using a DJI M300 RTK equipped with multispectral sensors. The dataset encompasses diverse field scenarios under five fertilization treatments (organic-only, organic–inorganic 7:3 and 3:7 ratios, inorganic-only, and no fertilizer) and two irrigation regimes (full and deficit irrigation), ensuring representativeness and generalizability. For model development, we replaced conventional VGG16 with ResNet-50 as the backbone network, incorporating residual connections and channel attention mechanisms to achieve 92.1% mean average precision (mAP) while reducing parameters from 135 M to 77 M (43% decrease). The GFLOPs of the improved model were reduced from 1.9 to 1.7, a decrease of 10.53%, improving the model's computational efficiency. Performance tests demonstrated a 15% reduction in missed detection rate compared to YOLOv8 in dense canopies, with spike count regression analysis yielding R2 = 0.88 (p < 0.05) against manual measurements and yield prediction errors below 10% for optimal treatments. To validate robustness, we established a dedicated 500-image test set (25% of total data) spanning density gradients (30–80 spikes/m2) and varying illumination conditions, maintaining >85% accuracy even under cloudy weather.
Furthermore, by integrating spike recognition with agronomic parameters (e.g., grain weight), we developed a comprehensive yield estimation model achieving 93.5% accuracy under optimal water–fertilizer management (70% ETc irrigation with 3:7 organic–inorganic ratio). This work systematically addresses key technical challenges in automated spike detection through standardized data acquisition, lightweight model design, and field validation, offering significant practical value for smart agriculture development. Full article
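The reductions quoted in the abstract can be checked directly as relative decreases:

```python
def pct_reduction(before, after):
    # Relative reduction, as a percentage of the original value
    return 100 * (before - after) / before

params = pct_reduction(135, 77)   # 135 M -> 77 M parameters
flops = pct_reduction(1.9, 1.7)   # 1.9 -> 1.7 GFLOPs
print(round(params), round(flops, 2))  # ~43% and ~10.53%, matching the text
```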
(This article belongs to the Special Issue Plant Phenotyping and Machine Learning)

27 pages, 5740 KB  
Article
Localization of Multiple GNSS Interference Sources Based on Target Detection in C/N0 Distribution Maps
by Qidong Chen, Rui Liu, Qiuzhen Yan, Yue Xu, Yang Liu, Xiao Huang and Ying Zhang
Remote Sens. 2025, 17(15), 2627; https://doi.org/10.3390/rs17152627 - 29 Jul 2025
Viewed by 1476
Abstract
The localization of multiple interference sources in Global Navigation Satellite Systems (GNSS) can be achieved using carrier-to-noise ratio (C/N0) information provided by GNSS receivers, such as those embedded in smartphones. However, in increasingly prevalent complex scenarios—such as the coexistence of multiple directional interferences, increased diversity and density of GNSS interference, and the presence of multiple low-power interference sources—conventional localization methods often fail to provide reliable results, thereby limiting their applicability in real-world environments. This paper presents a multi-interference sources localization method using object detection in GNSS C/N0 distribution maps. The proposed method first exploits the similarity between C/N0 data reported by GNSS receivers and image grayscale values to construct C/N0 distribution maps, thereby transforming the problem of multi-source GNSS interference localization into an object detection and localization task based on image processing techniques. Subsequently, an Oriented Squeeze-and-Excitation-based Faster Region-based Convolutional Neural Network (OSF-RCNN) framework is proposed to process the C/N0 distribution maps. Building upon the Faster R-CNN framework, the proposed method integrates an Oriented RPN (Region Proposal Network) to regress the orientation angles of directional antennas, effectively addressing their rotational characteristics. Additionally, the Squeeze-and-Excitation (SE) mechanism and the Feature Pyramid Network (FPN) are integrated at key stages of the network to improve sensitivity to small targets, thereby enhancing detection and localization performance for low-power interference sources. The simulation results verify the effectiveness of the proposed method in accurately localizing multiple interference sources under the increasingly prevalent complex scenarios described above. Full article
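The C/N0-map construction described above amounts to an affine mapping from carrier-to-noise ratios onto image grayscale values; the working range below is an assumption for illustration, as the abstract does not specify the scaling:

```python
def cn0_to_gray(cn0_dbhz, lo=20.0, hi=50.0):
    # Map a carrier-to-noise ratio (dB-Hz) onto a 0-255 grayscale value,
    # clipping outside an assumed working range [lo, hi].
    clipped = max(lo, min(hi, cn0_dbhz))
    return round(255 * (clipped - lo) / (hi - lo))

# One row of a hypothetical C/N0 distribution map: interference suppresses
# C/N0, so jammed regions appear dark and can be treated as image targets.
row = [cn0_to_gray(v) for v in (15.0, 20.0, 35.0, 50.0, 55.0)]
```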
(This article belongs to the Special Issue Advanced Multi-GNSS Positioning and Its Applications in Geoscience)

15 pages, 10355 KB  
Article
Automated Detection and Counting of Gossypium barbadense Fruits in Peruvian Crops Using Convolutional Neural Networks
by Juan Ballena-Ruiz, Juan Arcila-Diaz and Victor Tuesta-Monteza
AgriEngineering 2025, 7(5), 152; https://doi.org/10.3390/agriengineering7050152 - 12 May 2025
Cited by 2 | Viewed by 1647
Abstract
This study presents the development of a system based on convolutional neural networks for the automated detection and counting of Gossypium barbadense fruits, specifically the IPA cotton variety, during its maturation stage, known as “mota”, in crops located in the Lambayeque region of northern Peru. To achieve this, a dataset was created using images captured with a mobile device. After applying data augmentation techniques, the dataset consisted of 2186 images with 70,348 labeled fruits. Five deep learning models were trained: two variants of YOLO version 8 (nano and extra-large), two of YOLO version 11, and one based on the Faster R-CNN architecture. The dataset was split into 70% for training, 15% for validation, and 15% for testing, and all models were trained over 100 epochs with a batch size of 8. The extra-large YOLO models achieved the highest performance, with precision scores of 99.81% and 99.78%, respectively, and strong recall and F1-score values. In contrast, the nano models and Faster R-CNN showed slightly lower effectiveness. Additionally, the best-performing model was integrated into a web application developed in Python, enabling automated fruit counting from field images. The YOLO architecture emerged as an efficient and robust alternative for the automated detection of cotton fruits and stood out for its capability to process images in real time with high precision. Furthermore, its implementation in crop monitoring facilitates production estimation and decision-making in precision agriculture. Full article
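The 70/15/15 split used above can be sketched as a seeded shuffle-and-slice (the authors' exact split procedure is not specified; the seed and rounding here are assumptions):

```python
import random

def split_dataset(items, train=0.70, val=0.15, seed=42):
    # Shuffle deterministically, then slice into train/validation/test;
    # the remainder after the train and validation cuts goes to test.
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = int(len(items) * train)
    n_val = int(len(items) * val)
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])

# 2186 images, as in the augmented dataset described above
train_set, val_set, test_set = split_dataset(range(2186))
```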

23 pages, 25076 KB  
Article
Integrating DEM and Deep Learning for Forested Terrain Analysis: Enhancing Fire Risk Assessment Through Mountain Peak and Water System Extraction in Chongli District
by Yihui Wu, Xueying Sun, Liang Qi, Jiang Xu, Demin Gao and Zhengli Zhu
Forests 2025, 16(4), 692; https://doi.org/10.3390/f16040692 - 16 Apr 2025
Viewed by 1347
Abstract
Accurate fire risk assessment in forested terrain is crucial for effective disaster management and ecological conservation. This study innovatively proposes a novel framework that integrates Digital Elevation Models (DEMs) with deep learning techniques to enhance fire risk assessment in Chongli District. Our framework innovatively combines DEM data with Faster Regions with Convolutional Neural Networks (Faster R-CNN) and CNN-based methods, breaking through the limitations of traditional approaches that rely on manual feature extraction. It is capable of automatically identifying critical terrain features, such as mountain peaks and water systems, with higher accuracy and efficiency. DEMs provide high-resolution topographical information, which deep learning models leverage to accurately identify and delineate key geographical features. Our results show that the integration of DEMs and deep learning significantly improves the accuracy of fire risk assessment by offering detailed and precise terrain analysis, thereby providing more reliable inputs for fire behavior prediction. The extracted mountain peaks and water systems, as fundamental inputs for fire behavior prediction, enable more accurate predictions of fire spread and potential impact areas. This study not only highlights the great potential of combining geospatial data with advanced machine learning techniques but also offers a scalable and efficient solution for forest fire risk management in mountainous regions. Future work will focus on expanding the dataset to include more environmental variables and validating the model in different geographical areas to further enhance its robustness and applicability. Full article
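Peak extraction from a DEM can be illustrated with the simplest local-maximum criterion; the 8-neighbour rule below is an illustrative baseline, not the paper's deep-learning method:

```python
def find_peaks(dem):
    # Return (row, col) cells strictly higher than all of their
    # (up to 8) neighbours -- a naive mountain-peak candidate test.
    peaks = []
    rows, cols = len(dem), len(dem[0])
    for r in range(rows):
        for c in range(cols):
            neighbours = [dem[rr][cc]
                          for rr in range(max(0, r - 1), min(rows, r + 2))
                          for cc in range(max(0, c - 1), min(cols, c + 2))
                          if (rr, cc) != (r, c)]
            if all(dem[r][c] > n for n in neighbours):
                peaks.append((r, c))
    return peaks

# Toy 3x3 elevation grid (metres): one clear summit at the centre
dem = [[100, 101, 100],
       [101, 150, 102],
       [100, 103, 120]]
print(find_peaks(dem))
```

Hand-crafted rules like this are exactly what the CNN-based pipeline above replaces with learned features.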
(This article belongs to the Special Issue Fire Ecology and Management in Forest—2nd Edition)

27 pages, 5073 KB  
Review
A Comprehensive Review of Deep Learning in Computer Vision for Monitoring Apple Tree Growth and Fruit Production
by Meng Lv, Yi-Xiao Xu, Yu-Hang Miao and Wen-Hao Su
Sensors 2025, 25(8), 2433; https://doi.org/10.3390/s25082433 - 12 Apr 2025
Cited by 3 | Viewed by 5048
Abstract
The high nutritional and medicinal value of apples has contributed to their widespread cultivation worldwide. Unfavorable factors in the healthy growth of trees and extensive orchard work are threatening the profitability of apples. This study reviewed deep learning combined with computer vision for monitoring apple tree growth and fruit production processes in the past seven years. Three types of deep learning models were used for real-time target recognition tasks: detection models including You Only Look Once (YOLO) and faster region-based convolutional network (Faster R-CNN); classification models including Alex network (AlexNet) and residual network (ResNet); segmentation models including segmentation network (SegNet), and mask regional convolutional neural network (Mask R-CNN). These models have been successfully applied to detect pests and diseases (located on leaves, fruits, and trunks), organ growth (including fruits, apple blossoms, and branches), yield, and post-harvest fruit defects. This study introduced deep learning and computer vision methods, outlined in the current research on these methods for apple tree growth and fruit production. The advantages and disadvantages of deep learning were discussed, and the difficulties faced and future trends were summarized. It is believed that this research is important for the construction of smart apple orchards. Full article
(This article belongs to the Section Smart Agriculture)

19 pages, 5298 KB  
Article
A Health Status Identification Method for Rotating Machinery Based on Multimodal Joint Representation Learning and a Residual Neural Network
by Xiangang Cao and Kexin Shi
Appl. Sci. 2025, 15(7), 4049; https://doi.org/10.3390/app15074049 - 7 Apr 2025
Cited by 2 | Viewed by 923
Abstract
Given that rotating machinery is one of the most commonly used types of mechanical equipment in industrial applications, the identification of its health status is crucial for the safe operation of the entire system. Traditional equipment health status identification mainly relies on conventional single-modal data, such as vibration or acoustic modalities, which often have limitations and false alarm issues when dealing with real-world operating conditions and complex environments. However, with the increasing automation of coal mining equipment, the monitoring of multimodal data related to equipment operation has become more prevalent. Existing multimodal health status identification methods are still imperfect in extracting features, with poor complementarity and consistency among modalities. To address these issues, this paper proposes a multimodal joint representation learning and residual neural network-based method for rotating machinery health status identification. First, vibration, acoustic, and image modal information is comprehensively utilized, which is extracted using a Gramian Angular Field (GAF), Mel-Frequency Cepstral Coefficients (MFCCs), and a Faster Region-based Convolutional Neural Network (RCNN), respectively, to construct a feature set. Second, an orthogonal projection combined with a Transformer is used to enhance the target modality, while a modality attention mechanism is introduced to take into consideration the interaction between different modalities, enabling multimodal fusion. Finally, the fused features are input into a residual neural network (ResNet) for health status identification. Experiments conducted on a gearbox test platform validate the proposed method, and the results demonstrate that it significantly improves the accuracy and reliability of rotating machinery health state identification. Full article
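The Gramian Angular Field transform mentioned above turns a 1-D vibration series into an image-like matrix a CNN can consume; a minimal GASF (summation-field) sketch:

```python
import math

def gasf(series):
    # Gramian Angular Summation Field: rescale the series to [-1, 1],
    # take phi = arccos(x), and form G[i][j] = cos(phi_i + phi_j).
    lo, hi = min(series), max(series)
    x = [2 * (v - lo) / (hi - lo) - 1 for v in series]
    phi = [math.acos(max(-1.0, min(1.0, v))) for v in x]
    return [[math.cos(pi + pj) for pj in phi] for pi in phi]

# Toy 3-sample series; real inputs would be vibration windows
g = gasf([1.0, 2.0, 3.0])
```

The diagonal satisfies G[i][i] = 2 x_i^2 - 1, so the original (rescaled) signal remains recoverable from the image.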

18 pages, 3958 KB  
Article
AI-Driven UAV Surveillance for Agricultural Fire Safety
by Akmalbek Abdusalomov, Sabina Umirzakova, Komil Tashev, Nodir Egamberdiev, Guzalxon Belalova, Azizjon Meliboev, Ibragim Atadjanov, Zavqiddin Temirov and Young Im Cho
Fire 2025, 8(4), 142; https://doi.org/10.3390/fire8040142 - 2 Apr 2025
Cited by 6 | Viewed by 1969
Abstract
The increasing frequency and severity of agricultural fires pose significant threats to food security, economic stability, and environmental sustainability. Traditional fire-detection methods, relying on satellite imagery and ground-based sensors, often suffer from delayed response times and high false-positive rates, limiting their effectiveness in mitigating fire-related damages. In this study, we propose an advanced deep learning-based fire-detection framework that integrates the Single-Shot MultiBox Detector (SSD) with the computationally efficient MobileNetV2 architecture. This integration enhances real-time fire- and smoke-detection capabilities while maintaining a lightweight and deployable model suitable for Unmanned Aerial Vehicle (UAV)-based agricultural monitoring. The proposed model was trained and evaluated on a custom dataset comprising diverse fire scenarios, including various environmental conditions and fire intensities. Comprehensive experiments and comparative analyses against state-of-the-art object-detection models, such as You Only Look Once (YOLO), Faster Region-based Convolutional Neural Network (Faster R-CNN), and SSD-based variants, demonstrated the superior performance of our model. The results indicate that our approach achieves a mean Average Precision (mAP) of 97.7%, significantly surpassing conventional models while maintaining a detection speed of 45 frames per second (fps) and requiring only 5.0 GFLOPs of computational power. These characteristics make it particularly suitable for deployment in edge-computing environments, such as UAVs and remote agricultural monitoring systems. Full article
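The mAP figure reported above is computed from intersection-over-union (IoU) matching between predicted and ground-truth boxes and the resulting precision-recall curve. A simplified single-class sketch of both quantities (not the authors' evaluation code; box coordinates and the 0.5 IoU matching threshold are illustrative defaults):

```python
import numpy as np

def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def average_precision(scores, matched, n_gt):
    """Simplified AP for one class: detections are sorted by confidence;
    `matched` flags whether each detection hit a ground-truth box
    (e.g. IoU >= 0.5). Area under the precision-recall steps."""
    order = np.argsort(scores)[::-1]
    hits = np.array(matched, dtype=bool)[order]
    tp = np.cumsum(hits)
    fp = np.cumsum(~hits)
    recall = tp / n_gt
    precision = tp / (tp + fp)
    r = np.concatenate(([0.0], recall))
    return float(np.sum((r[1:] - r[:-1]) * precision))

print(round(iou((0, 0, 2, 2), (1, 1, 3, 3)), 4))  # 1/7 -> 0.1429
```

Mean AP (mAP) then averages this per-class AP over all classes, here fire and smoke.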

39 pages, 13137 KB  
Article
Neural Network-Based Emotion Classification in Medical Robotics: Anticipating Enhanced Human–Robot Interaction in Healthcare
by Waqar Riaz, Jiancheng (Charles) Ji, Khalid Zaman and Gan Zengkang
Electronics 2025, 14(7), 1320; https://doi.org/10.3390/electronics14071320 - 27 Mar 2025
Cited by 3 | Viewed by 1258
Abstract
This study advances artificial intelligence by pioneering the classification of patient emotions with a healthcare mobile robot, anticipating human–robot interaction for patients admitted to hospitals or other healthcare environments. It addresses the challenge of accurately classifying human emotions as patient emotions, a critical factor in understanding patients’ recent moods and situations. We integrate convolutional neural networks (CNNs), recurrent neural networks (RNNs), and multi-layer perceptrons (MLPs) to analyze facial emotions comprehensively. The process begins by deploying a faster region-based convolutional neural network (Faster R-CNN) to swiftly and accurately identify human emotions in real-time and recorded video feeds. Feature extraction across three CNN models, combined with fusion techniques, strengthens an improved Inception-V3 module, which replaces the Faster R-CNN feature learning module to enhance the accuracy of face detection in the proposed framework. The datasets were carefully acquired in a simulated environment. Validation on the EMOTIC, CK+, FER-2013, and AffectNet datasets showed accuracy rates of 98.01%, 99.53%, 99.27%, and 96.81%, respectively. These class-wise accuracy rates demonstrate the potential to advance healthcare environments and the intelligent manufacturing of healthcare mobile robots. Full article
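The fusion of features from several CNN branches described above can be illustrated with a minimal late-fusion sketch: concatenate per-branch embeddings, then apply a linear classification head with softmax. The feature widths, the 7-class emotion head, and the random weights are stand-ins for trained parameters, not the authors' actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Hypothetical per-branch embeddings for one detected face crop:
# three CNN branches produce feature vectors of different widths.
feat_a = rng.standard_normal(128)
feat_b = rng.standard_normal(256)
feat_c = rng.standard_normal(64)

# Late fusion by concatenation, then a linear head over 7 emotion
# classes (random weights stand in for trained parameters).
fused = np.concatenate([feat_a, feat_b, feat_c])  # shape (448,)
W = rng.standard_normal((fused.size, 7)) * 0.01
b = np.zeros(7)
probs = softmax(fused @ W + b)

print(probs.shape)  # (7,): a probability per emotion class
```

Real fusion modules typically add learned projection layers or attention over the branches, but the concatenate-then-classify skeleton is the same.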
(This article belongs to the Special Issue New Advances of Brain-Computer and Human-Robot Interaction)
