Search Results (693)

Search Parameters:
Keywords = Mask R-CNN

15 pages, 1404 KB  
Article
A Deep Learning-Based Decision Support System for Cholelithiasis in MRI Data
by Ebru Hasbay, Caglar Cengizler, Mahmut Ucar, Nagihan Durgun, Hayriye Ulkucan Disli and Deniz Bolat
J. Clin. Med. 2026, 15(5), 1891; https://doi.org/10.3390/jcm15051891 - 2 Mar 2026
Viewed by 166
Abstract
Background: Cholelithiasis can lead to significant complications if not diagnosed and treated promptly. Recent advances in deep learning and the improved ability of computer systems to detect clinically significant textural and morphological patterns in magnetic resonance imaging (MRI) can help reduce the time and resources required for the radiological evaluation of the gallbladder and cholelithiasis. Objective: To detect cholelithiasis, a support system with a graphical user interface for magnetic resonance (MR) images of the gallbladder was implemented to reduce the manual effort and time required to identify gallstones. Method: A commonly used deep learning model for pixel-level mask generation and instance segmentation, Mask Region-Based Convolutional Neural Network (Mask R-CNN), was modified, trained, and evaluated to provide a robust pipeline for automated analysis. The primary aim was to automatically locate and label the gallbladder in T2-weighted axial MR images to detect gallstones and highlight the visual characteristics of the target region, thereby supporting radiologists. All automation was designed to operate on a single optimal slice instead of the entire volume. While this approach limits generalisability, it offers a practical starting point for method development. This setup reflects a feasibility-oriented design rather than a comprehensive diagnostic capability. The dataset included 788 axial MR images from different patients. Each image was labeled and segmented by an experienced radiologist to train and test the models at the image level. Results: The proposed model with squeeze-and-excitation (SE) modification improved classification accuracy, and at the image level, stone detection improved in terms of accuracy, precision, and specificity, although recall and F1 scores slightly decreased. 
Conclusions: The results show that the modified Mask R-CNN model can detect gallstones with up to 0.89 accuracy, supporting the clinical applicability of the proposed method. Full article
(This article belongs to the Topic Machine Learning and Deep Learning in Medical Imaging)
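The squeeze-and-excitation (SE) modification mentioned in the abstract can be illustrated with a minimal sketch. The following is a generic SE channel-recalibration block in NumPy; the weight shapes, reduction ratio, and placement within Mask R-CNN are assumptions, not the paper's implementation:

```python
import numpy as np

def se_block(feature_map, w1, w2):
    """Squeeze-and-excitation: globally pool each channel, pass the pooled
    vector through a two-layer bottleneck, and rescale channels by the
    resulting sigmoid gate. Shapes: feature_map (C, H, W), w1 (C//r, C),
    w2 (C, C//r), with reduction ratio r chosen by the designer."""
    c = feature_map.shape[0]
    squeeze = feature_map.reshape(c, -1).mean(axis=1)   # (C,) global average pool
    excite = np.maximum(w1 @ squeeze, 0.0)              # ReLU bottleneck, (C//r,)
    scale = 1.0 / (1.0 + np.exp(-(w2 @ excite)))        # sigmoid gate, (C,)
    return feature_map * scale[:, None, None]           # channel-wise recalibration
```

In a Mask R-CNN backbone, a block like this would typically be inserted after selected convolutional stages so the network can reweight channels before region proposal.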

30 pages, 3196 KB  
Systematic Review
Deep Learning-Based Dental Caries Diagnosis: A Modality-Stratified Systematic Review and Meta-Analysis of Faster R-CNN and Mask R-CNN
by Quang Tuan Lam, Minh Huu Nhat Le, Fang-Yu Fan, Nguyen Quoc Khanh Le and I-Ta Lee
Diagnostics 2026, 16(5), 731; https://doi.org/10.3390/diagnostics16050731 - 1 Mar 2026
Viewed by 398
Abstract
Background: Deep convolutional neural networks (DCNNs) are increasingly used in computer-aided dental diagnostics. However, the relative diagnostic performance of commonly applied architectures, particularly Faster R-CNN and Mask R-CNN, has not been systematically synthesized across imaging modalities. This systematic review and meta-analysis compared the diagnostic accuracy of Faster R-CNN and Mask R-CNN for dental caries detection using radiographic and photographic images. Methods: PubMed (MEDLINE), EMBASE, Web of Science, and Scopus were systematically searched for studies published up to 15 June 2025. Studies applying Faster R-CNN and/or Mask R-CNN to dental caries detection were included. Binary diagnostic data were extracted, and pooled sensitivity, specificity, and area under the receiver operating characteristic curve (AUC) were estimated using a bivariate random-effects model. Study quality was assessed with QUADAS-AI, and radiomics-based radiographic studies were additionally evaluated using the Radiomics Quality Score (RQS). The protocol was registered in PROSPERO (CRD420251074443). Results: Seventeen studies met the inclusion criteria. Across all imaging modalities, Mask R-CNN showed significantly higher pooled sensitivity (85.6% vs. 71.7%, p = 0.0244), specificity (94.2% vs. 81.4%, p = 0.00089), and AUC (0.95 vs. 0.84, p = 0.0053) than Faster R-CNN. In radiographic images, Mask R-CNN consistently outperformed Faster R-CNN in sensitivity (86.3% vs. 67.2%, p = 0.0497), specificity (96.5% vs. 85.0%, p = 0.00105), and AUC (0.97 vs. 0.86, p = 0.0067). In photographic images, Mask R-CNN achieved a higher AUC (0.91 vs. 0.83, p = 0.048), whereas differences in pooled sensitivity (83.5% vs. 77.3%, p = 0.435) and specificity (86.0% vs. 75.1%, p = 0.156) were not statistically significant. 
Conclusions: Faster R-CNN and Mask R-CNN both show potential for dental caries detection, but current evidence is limited by substantial heterogeneity, predominantly retrospective designs, and variability in imaging and labeling. Across the included studies, Mask R-CNN showed higher pooled performance estimates than Faster R-CNN, with the clearest differences in radiographic applications; however, this comparison is indirect and should be considered suggestive rather than definitive given study-level heterogeneity and uncertainty in the reference standard in a sizable proportion of studies. Prospective, multi-center studies with standardized imaging protocols, rigorous annotation, and independent external validation are required to support reliable clinical implementation. Full article
(This article belongs to the Special Issue Advances in Dental Diagnostics)
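The pooled sensitivity and specificity estimates above come from a bivariate random-effects model. As a much simpler illustration of the underlying idea, the sketch below pools study proportions on the logit scale with inverse-variance weights (a univariate fixed-effect simplification, not the paper's model; the example numbers are hypothetical):

```python
import math

def pooled_logit(proportions, ns):
    """Fixed-effect inverse-variance pooling of proportions (e.g. per-study
    sensitivities) on the logit scale, then back-transformed. A simplified
    stand-in for the bivariate random-effects model used in such
    meta-analyses."""
    num = den = 0.0
    for p, n in zip(proportions, ns):
        logit = math.log(p / (1 - p))
        var = 1.0 / (n * p * (1 - p))   # delta-method variance of the logit
        w = 1.0 / var                   # inverse-variance weight
        num += w * logit
        den += w
    return 1.0 / (1.0 + math.exp(-num / den))
```

A real bivariate model additionally estimates between-study heterogeneity and the correlation between sensitivity and specificity, which this sketch ignores.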

18 pages, 4072 KB  
Article
Classification and Contour Recognition of Welding Defects in Magneto-Optical Images
by Nvjie Ma, Guoying Zhang, Huazhuo Liang, Shichao Gu, Congyi Wang, Yanxi Zhang and Xiangdong Gao
Metals 2026, 16(3), 267; https://doi.org/10.3390/met16030267 - 28 Feb 2026
Viewed by 151
Abstract
In the field of magneto-optical imaging nondestructive testing for welding defects, multi-angle detection of welding defects has already been achieved. However, research on automatic defect recognition and contour extraction remains insufficient. Therefore, to enable automatic detection of welding defects using magneto-optical imaging technology, it is essential to address the key issues of defect recognition and contour extraction in magneto-optical images. The dataset in this article includes five types of images: defect-free, lack-of-fusion, cracks, pits, and weld reinforcement. First, the Mask R-CNN detection method was used to perform defect recognition and contour segmentation on the original magneto-optical image dataset. The results indicate that the recognition rate for lack-of-fusion and weld reinforcement in the original magneto-optical images is low, and the recognition accuracy for pits and cracks is extremely low. Subsequently, the magneto-optical image dataset was preprocessed using the differential level set method, and the Mask R-CNN algorithm was again used to identify defect types and segment defect contours. A comparison of the two experiments shows that detection accuracy on the preprocessed dataset was higher, with overall recognition accuracy increasing by 30%. Full article
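The abstract does not specify the "differential level set" preprocessing in detail, so the sketch below shows only the generic level-set evolution equation such methods build on: an explicit update of the level-set function phi by a speed field F scaled by the gradient magnitude. All names and the time step are illustrative:

```python
import numpy as np

def level_set_step(phi, speed, dt=0.1):
    """One explicit evolution step of a level-set function:
    phi <- phi + dt * F * |grad phi|. The zero level set of phi traces the
    evolving contour; `speed` (F) would be derived from image features in a
    real preprocessing pipeline."""
    gy, gx = np.gradient(phi)
    return phi + dt * speed * np.sqrt(gx ** 2 + gy ** 2)
```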

30 pages, 15102 KB  
Article
FireVision: An Early Fire and Smoke Detection Platform Utilizing Mask R-CNN Deep Learning Inferences
by Konstantina Spanoudaki, Meropi Tsoumani, Sotirios Kontogiannis, Myrto Konstantinidou, Ion Anastasios Karolos and George Kokkonis
Algorithms 2026, 19(3), 169; https://doi.org/10.3390/a19030169 - 24 Feb 2026
Viewed by 192
Abstract
This paper presents FireVision, an innovative platform and model for real-time fire detection and monitoring. The platform utilizes automated drone flights to collect high-resolution imagery in both suburban and forested settings. Ensemble deep learning inference, based on Mask R-CNN weak learners, is employed to trigger alerts. Detection performance is further enhanced by integrating ResNet-50, ResNet-101, and ResNet-152 classifiers, which can be deployed in the cloud or on the drone’s edge co-processing units. Additionally, a fire criticality index is introduced, leveraging detection bounds and masks to assess the severity of fire events, alongside an automated drone path-planning algorithm for identifying critical fire incidents. Experiments were conducted using a supervised, mask-annotated dataset to evaluate model accuracy and inference speed across various cloud and edge computing configurations. Results indicate that ResNet-101 surpasses ResNet-50 by 5 to 12.5 percent in mAP@0.5 mask accuracy, with an 18 percent increase in inference time on the cloud and a 27 percent increase on the drone edge device GPU. In comparison, ResNet-152 achieves a 0.5 to 1.2 percent improvement in mAP@0.5 over ResNet-101, but its inference time is nine times slower in the cloud and 1.3 times slower on the GPU. Full article
(This article belongs to the Special Issue AI Applications and Modern Industry)
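The platform triggers alerts from an ensemble of Mask R-CNN weak learners. A hedged sketch of quorum-based triggering is shown below; the detection dictionary fields (`score`, `label`) and the threshold and quorum values are assumptions, not FireVision's actual interface:

```python
def ensemble_alert(detections, conf_threshold=0.5, quorum=2):
    """Raise an alert when at least `quorum` weak learners report a fire or
    smoke detection above the confidence threshold. Each detection is the
    best-scoring output of one ensemble member."""
    votes = sum(
        1
        for d in detections
        if d["score"] >= conf_threshold and d["label"] in ("fire", "smoke")
    )
    return votes >= quorum
```

Requiring a quorum rather than a single positive detection is one common way an ensemble of weak learners suppresses spurious single-model alarms.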

22 pages, 39829 KB  
Article
Dual-Detector Vision and Depth-Aware Back-Projection for Accurate Apple Detection and 3D Localisation for Robotic Harvesting
by Tagor Hossain, Peng Shi and Levente Kovacs
Robotics 2026, 15(2), 47; https://doi.org/10.3390/robotics15020047 - 22 Feb 2026
Viewed by 276
Abstract
Accurate apple detection and precise three-dimensional (3D) localisation are essential for autonomous robotic harvesting in orchard environments, where occlusion, illumination variation, depth noise, and the similar colour appearance of fruits and surrounding leaves present significant challenges. This paper proposes a dual-detector vision framework combined with depth-aware back-projection to achieve robust apple detection and metric 3D localisation in real time. The method integrates the complementary strengths of YOLOv8 and Mask R-CNN through confidence-weighted fusion of bounding boxes and pixel-wise union of segmentation masks, producing stabilised two-dimensional (2D) apple representations under visually ambiguous conditions. The fusion results are converted into dense 3D representations through depth-guided projection within the camera coordinate system representing the visible fruit surface. A depth-consistency weighting strategy assigns higher influence to depth-reliable pixels during centroid computation, thereby suppressing noisy or occluded depth measurements and improving the stability of 3D fruit centre estimation, while local intensity normalisation standardises neighbourhood-level pixel intensities to reduce the impact of shadows, highlights, and uneven lighting, enabling more consistent segmentation and detection across varying illumination conditions. Experimental results demonstrate an accuracy of 98.9%, an mAP of 94.2%, an F1-score of 93.3%, and a recall of 92.8%, while achieving real-time performance at 86.42 FPS, confirming the suitability of the proposed method for robotic harvesting in challenging orchard environments. Full article
(This article belongs to the Special Issue Perception and AI for Field Robotics)
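Two of the steps described above, confidence-weighted fusion of the YOLOv8 and Mask R-CNN bounding boxes and the depth-consistency-weighted centroid, can be sketched as follows. This is a minimal illustration of the stated ideas, not the authors' code:

```python
import numpy as np

def fuse_boxes(box_a, score_a, box_b, score_b):
    """Confidence-weighted average of two [x1, y1, x2, y2] boxes from the
    two detectors."""
    wa = score_a / (score_a + score_b)
    wb = score_b / (score_a + score_b)
    return [wa * a + wb * b for a, b in zip(box_a, box_b)]

def depth_weighted_centroid(points_3d, depth_confidence):
    """3D fruit-centre estimate in which depth-reliable back-projected pixels
    get proportionally more influence than noisy ones."""
    w = np.asarray(depth_confidence, dtype=float)
    p = np.asarray(points_3d, dtype=float)
    return (p * w[:, None]).sum(axis=0) / w.sum()
```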

14 pages, 6614 KB  
Article
Watershed YOLO: Method for Ordered Recognition of Microwave Photonic Radar Scatter Points Based on YOLO and Peak-Constrained Watershed Algorithm
by Chunyang Liu, Zhilei Hu, Tian Gao, Xin Sui, Kunning Ji, Ye Tong, Yan Huang and Nan Guo
Electronics 2026, 15(4), 811; https://doi.org/10.3390/electronics15040811 - 13 Feb 2026
Viewed by 190
Abstract
To address the challenge of achieving high-precision and ordered calibration of strong-scatter points in inverse synthetic aperture radar (ISAR) images, this paper proposes a collaborative framework that integrates YOLOv12-pose with Peak-Constrained Watershed (PCW). The method first employs the YOLOv12-pose model to produce an initial localization of scatter points. PCW is then applied to fine-segment individual points. Finally, a three-stage global optimal matching strategy is introduced to achieve high-precision fusion between index labels and their geometric positions. Experimental results on a microwave photonic radar ISAR dataset demonstrated that the proposed method achieved an average error of 1.89 pixels, with accuracy, recall, and F1 scores exceeding 95%. The approach significantly outperformed standalone YOLO, Mask R-CNN, and traditional SVM-based methods while maintaining label consistency and substantially improving precision and robustness for the recognition, localization, and tracking of strong scatter points in ISAR imagery. Full article
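The "peak-constrained" part of PCW amounts to seeding the watershed only from intensity peaks. A minimal sketch of that marker-selection step is below (pure NumPy scan; a strict-greater 4-neighbour test and a height threshold are simplifications of whatever constraint the authors use). In practice these markers would be fed to a marker-controlled watershed such as scikit-image's `segmentation.watershed`:

```python
import numpy as np

def peak_markers(image, min_height):
    """Return (row, col) positions of interior pixels that are strictly
    greater than their 4-neighbours and at least `min_height`: the markers
    that constrain the watershed flooding to one region per scatter point."""
    h, w = image.shape
    peaks = []
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            v = image[i, j]
            if v >= min_height and all(
                v > image[i + di, j + dj]
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1))
            ):
                peaks.append((i, j))
    return peaks
```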

17 pages, 2972 KB  
Article
A Deep Learning-Based Method for Non-Destructive Estimation of Carbonate Carbon Storage in Biogenic Shells on Marine Engineering Materials
by Haonan Huang, Mengting Jia, Qiang Xu, Zhiqiang Cui and Junyu He
Materials 2026, 19(4), 691; https://doi.org/10.3390/ma19040691 - 11 Feb 2026
Viewed by 231
Abstract
Hard-shelled organisms colonizing marine engineering surfaces accumulate carbonate inorganic carbon in their shells, yet quantification typically relies on destructive sampling, hindering long-term monitoring. This study develops a deep learning-based, non-destructive framework to estimate shell carbonate carbon storage from in situ images. Panels of different surface materials were deployed in the nearshore waters of Liuheng Island (Zhoushan) and monitored for five months, yielding 90 panel images from June to October. An improved Mask R-CNN identified barnacles and bivalves and extracted shell dimensions, which were combined with allometric relationships and measured shell carbonate carbon fractions (12.07% for barnacles; 12.14% for bivalves) to estimate carbon storage. Peak colonization occurred on uncoated polyvinyl chloride (PVC) panels in September (~110 individuals per panel), corresponding to 1.061 g carbonate carbon per panel. The model achieved recall/precision of 0.86/0.89 under complex nearshore conditions; image-derived dimensions agreed with manual measurements (R² = 0.95). Allometric models showed R² of 0.82 (barnacles) and 0.90 (bivalves), and panel-scale estimation errors were <15%. The method enables non-destructive quantitative characterization and comparison of shell carbonate carbon storage across materials and exposure conditions for long-term monitoring. Full article
(This article belongs to the Section Green Materials)
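The final estimation step combines image-derived shell dimensions with allometric mass relationships and the measured carbonate-carbon fractions (12.07% for barnacles, 12.14% for bivalves). A sketch of that per-panel computation, with hypothetical allometric coefficients `a` and `b` standing in for the fitted species-specific values:

```python
def panel_carbon_g(shell_lengths_mm, a, b, carbon_fraction):
    """Estimate a panel's shell carbonate-carbon storage in grams:
    allometric shell mass (mass_g = a * length**b; a, b are species-specific
    fitted coefficients, values hypothetical here) summed over detected
    individuals and scaled by the measured carbonate-carbon fraction."""
    return sum(a * length ** b * carbon_fraction for length in shell_lengths_mm)
```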

17 pages, 2032 KB  
Article
AI-Based Pulmonary Embolism Detection: The Added Value of a False-Positive Reduction Module over a Region Proposal Network
by Jeong Sub Lee, Euijin Hwang, Changgyun Jin, Kyong Joon Lee, Ye Ra Choi and Sang Il Choi
Diagnostics 2026, 16(4), 524; https://doi.org/10.3390/diagnostics16040524 - 9 Feb 2026
Viewed by 350
Abstract
Background: High false-positive rates remain a significant challenge in the automated detection of pulmonary embolism (PE) using Computed Tomography Pulmonary Angiography (CTPA). This study evaluated the additional value of a False-Positive Reduction (FPR) module integrated into a Region Proposal Network (RPN). Methods: A retrospective analysis of 303 CTPA scans (163 PE-positive and 140 PE-negative) from a single tertiary institution was conducted. Both models were additionally validated on an independent external cohort of 100 CTPA scans (50 PE-positive and 50 PE-negative) from the RSNA PE Challenge dataset. The diagnostic performance of the one-stage RPN-only model was compared with that of a two-stage Modified Mask R-CNN (Region-based Convolutional Neural Network) incorporating the FPR module. Results: The Modified Mask R-CNN exhibited significant improvement in terms of specificity. The false-positive rate per scan decreased by 31% in comparison to the RPN-only model. Although there was a slight reduction in patient-level sensitivity, the Positive Predictive Value significantly increased by 10.5%. Additionally, patient-level specificity for emboli with a volume ≥ 1000 mm³ increased, reflecting a 7.4% relative improvement in detecting clinically significant emboli. Conclusions: The Modified Mask R-CNN significantly reduced false positives relative to the RPN-only model while maintaining high sensitivity. Full article
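The two-stage idea, in which candidates from the first-stage region proposal are re-scored by a dedicated false-positive reduction classifier, can be sketched as a simple filter. The scoring callable and threshold below are stand-ins, not the paper's trained FPR module:

```python
def reduce_false_positives(candidates, fpr_score, threshold=0.5):
    """Second-stage filtering: keep only first-stage candidate detections
    that the false-positive reduction classifier also accepts. `fpr_score`
    stands in for the trained FPR module and returns a confidence in
    [0, 1] for each candidate."""
    return [c for c in candidates if fpr_score(c) >= threshold]
```

The trade-off observed in the study follows directly from this structure: a second gate can only remove detections, so specificity and PPV rise while sensitivity can drop slightly.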

24 pages, 3288 KB  
Article
Multi-Task Deep Learning for Lung Nodule Detection and Segmentation in CT Scans
by Runhan Li and Barmak Honarvar Shakibaei Asli
Electronics 2026, 15(4), 736; https://doi.org/10.3390/electronics15040736 - 9 Feb 2026
Viewed by 312
Abstract
The early detection of pulmonary nodules in chest CT scans is critical for improving lung cancer outcomes. While existing computer-aided diagnosis (CAD) systems have shown promise, most treat detection and segmentation as separate tasks, leading to fragmented pipelines and limited representation sharing. This study proposes a 2.5D multi-task learning (MTL) framework that integrates both tasks within a unified Mask R-CNN architecture. The framework incorporates a tailored preprocessing pipeline—including Hounsfield Unit (HU) normalisation, CLAHE enhancement, and lung parenchyma masking—to improve input consistency and task-relevant contrast characteristics. To enhance sensitivity for small or ambiguous nodules, an auxiliary RoI classifier is introduced. Additionally, a nodule-level evaluation strategy aggregates slice-wise predictions across the z-axis, supporting a clinically meaningful assessment that approximates 3D diagnostic workflows. Experiments on the LUNA16 dataset demonstrate that the proposed framework achieves a favourable trade-off between detection and segmentation performance under a unified 2.5D multi-task setting. These results highlight the potential of integrated MTL approaches to advance CAD systems for early lung cancer screening. Full article
(This article belongs to the Special Issue Deep Learning for Computer Vision Application: Second Edition)
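Two pieces of the pipeline above are simple enough to sketch: Hounsfield Unit normalisation and the 2.5D input construction (a slice stacked with its axial neighbours). The window bounds are illustrative lung-window values, not necessarily the ones used in the paper:

```python
import numpy as np

def hu_window(volume, low=-1000.0, high=400.0):
    """Clip CT Hounsfield units to a window (bounds illustrative) and
    rescale to [0, 1] for network input."""
    v = np.clip(volume, low, high)
    return (v - low) / (high - low)

def slices_2p5d(volume, index):
    """Stack a slice with its two axial neighbours as a 3-channel 2.5D
    input, giving the 2D network limited through-plane context."""
    return np.stack([volume[index - 1], volume[index], volume[index + 1]])
```

CLAHE enhancement and lung-parenchyma masking would sit between these two steps in the described preprocessing order.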

15 pages, 6963 KB  
Article
Nondestructive Detection of Early Subsurface Bruises in Fragrant Pears Using Structured-Illumination Reflectance Imaging and Mask R-CNN
by Baishao Zhan, Zhangwei Guo, Qicheng Li, Wei Luo, Jicong Chen and Hailiang Zhang
Spectrosc. J. 2026, 4(1), 4; https://doi.org/10.3390/spectroscj4010004 - 6 Feb 2026
Viewed by 194
Abstract
To achieve accurate identification of early subcutaneous bruising regions in fragrant pears, this study developed a detection system based on Structured-Illumination Reflectance Imaging (SIRI) and integrated it with both machine learning and deep learning models. Structured-illumination images were acquired at six spatial frequencies (50, 100, 150, 200, 250, and 300 cycle·m⁻¹) and evaluated after demodulation through both visual assessment and contrast index (CI) analysis. The optimal spatial frequency of 150 cycle·m⁻¹ was selected for subsequent analysis. Texture features were extracted from AC, DC, and RT images based on the gray-level co-occurrence matrix (GLCM), and classification was performed using three machine learning models (KNN, PLS-DA, and LightGBM) and the deep learning Mask R-CNN model. The results showed that the classification performance of RT images was superior to that of AC and DC images. Among them, the PLS-DA model achieved an accuracy of 95.00% on the test set for RT images. The Mask R-CNN model achieved a recognition accuracy of 99.17% on the RT image test set. These results demonstrate that the combination of SIRI and deep learning enables highly sensitive and nondestructive detection of early subcutaneous bruising in Korla pears, providing an efficient and reliable technical approach for fruit quality grading and postharvest intelligent inspection. Full article
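The AC, DC, and RT images mentioned above come from demodulating three phase-shifted structured-illumination captures (phase step 2π/3). The standard three-phase demodulation formulas can be sketched as follows; this shows the conventional SIRI equations, not necessarily the exact implementation used in the paper:

```python
import numpy as np

def demodulate_three_phase(i1, i2, i3):
    """Three-phase SIRI demodulation. DC is the mean of the three captures;
    AC recovers the amplitude of the modulated component; RT is the ratio
    AC / DC, which normalises out illumination and reflectance variations."""
    dc = (i1 + i2 + i3) / 3.0
    ac = (np.sqrt(2.0) / 3.0) * np.sqrt(
        (i1 - i2) ** 2 + (i1 - i3) ** 2 + (i2 - i3) ** 2
    )
    rt = ac / np.maximum(dc, 1e-9)  # guard against division by zero
    return ac, dc, rt
```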

30 pages, 5076 KB  
Article
Building Footprint Extraction for Large-Scale Basemaps Using Very-High-Resolution Satellite Imagery
by Yofri Furqani Hakim and Fuan Tsai
Buildings 2026, 16(3), 675; https://doi.org/10.3390/buildings16030675 - 6 Feb 2026
Viewed by 306
Abstract
Accurate building footprints are a fundamental element of large-scale base maps, which serve as critical inputs for urban planning, infrastructure development, environmental monitoring, and disaster management. While building footprint extraction and geometric regularization have been widely studied, their combined application for automated, large-scale base map generation using very-high-resolution satellite imagery has received limited attention. To address this gap, this study proposes an integrated framework that leverages deep learning and geometric regularization to efficiently extract and refine building footprints for large-scale base maps. The framework first enhances spectral, spatial, and textural features of very-high-resolution satellite imagery through pan-sharpening, NDVI computation, GLCM-based texture analysis, and PCA. A Mask R-CNN model is then trained on multi-band imagery to segment building footprints, followed by geometric regularization to simplify and align polygons along dominant structural orientations. Object-based evaluation on ground-truth buildings demonstrates high performance, with 97.6% precision, 91.6% recall, and a 94.5% F1-score. The proposed systematic framework substantially reduces production time compared to manual stereo-plotting, requiring less than an hour per 5.29 km² map sheet in operational production, representing a more than 35-fold efficiency gain. While minor geometric inaccuracies and merged adjacent buildings persist, the methodology offers a robust, scalable, and efficient approach to support large-scale base map production. Full article
(This article belongs to the Section Construction Management, and Computers & Digitization)
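Regularization "along dominant structural orientations" starts by estimating the footprint's dominant edge direction. One common way to do this, shown as a sketch rather than the authors' method, is a length-weighted circular mean of edge angles taken modulo 90° (so perpendicular walls vote for the same axis):

```python
import math

def dominant_orientation(polygon):
    """Length-weighted dominant edge direction of a footprint polygon,
    in radians modulo pi/2. Angles are multiplied by 4 before averaging on
    the unit circle so that directions 90 degrees apart reinforce each
    other and wrap-around is handled correctly."""
    sx = sy = 0.0
    for (x1, y1), (x2, y2) in zip(polygon, polygon[1:] + polygon[:1]):
        length = math.hypot(x2 - x1, y2 - y1)
        theta = math.atan2(y2 - y1, x2 - x1) % (math.pi / 2)
        sx += length * math.cos(4 * theta)
        sy += length * math.sin(4 * theta)
    return (math.atan2(sy, sx) / 4) % (math.pi / 2)
```

A regularizer would then snap each simplified edge to this axis or its perpendicular.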

19 pages, 4140 KB  
Article
Bamboo Forest Area Extraction and Clump Identification Using Semantic Segmentation and Instance Segmentation Models
by Keng-Hao Liu, Shih-Ji Lin, Che-Wei Hu and Chinsu Lin
Forests 2026, 17(2), 191; https://doi.org/10.3390/f17020191 - 1 Feb 2026
Viewed by 211
Abstract
This study addresses the need for effective bamboo monitoring in smart forestry as UAV imagery and AI-based methods continue to advance. Bambusa stenostachya (thorny bamboo), commonly found in the badland regions of southern Taiwan, spreads rapidly due to its strong reproductive capacity and extensive rhizome system, often causing forestland degradation and challenges to sustainable management. An automated detection approach is therefore required to capture bamboo dynamics and support forest resource assessment. We use a dual-component framework for detecting bamboo forests and individual bamboo clumps from high-resolution UAV orthomosaic imagery. The first component performs semantic segmentation using U-Net or SegFormer to extract bamboo forest areas and generate a corresponding forest mask. The second component independently applies instance segmentation using YOLOv8-Seg and Mask R-CNN to delineate and localize individual bamboo clumps. The dataset was collected from Compartment 43 of the Qishan Working Circle in Kaohsiung, Taiwan. Experimental results show strong model performance: bamboo forest segmentation achieved an F1-score of 0.9569, while bamboo clump instance segmentation reached a precision of 0.8232. These findings demonstrate the promising potential of deep learning-based segmentation techniques for improving bamboo detection and supporting operational forest monitoring. Full article
(This article belongs to the Special Issue Application of Machine-Learning Methods in Forestry)
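The two components run independently, so one natural way to combine them, sketched below as an assumption rather than the paper's stated fusion rule, is to keep only instance-segmented clumps that sufficiently overlap the semantic forest mask:

```python
import numpy as np

def clumps_inside_forest(instance_masks, forest_mask, min_overlap=0.5):
    """Keep bamboo-clump instance masks whose area overlaps the semantic
    forest mask by at least `min_overlap` (threshold is an assumption),
    suppressing instances detected outside the mapped bamboo area."""
    kept = []
    for m in instance_masks:
        inter = np.logical_and(m, forest_mask).sum()
        if inter / max(m.sum(), 1) >= min_overlap:
            kept.append(m)
    return kept
```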

29 pages, 1843 KB  
Systematic Review
Deep Learning for Tree Crown Detection and Delineation Using UAV and High-Resolution Imagery for Biometric Parameter Extraction: A Systematic Review
by Abdulrahman Sufyan Taha Mohammed Aldaeri, Chan Yee Kit, Lim Sin Ting and Mohamad Razmil Bin Abdul Rahman
Forests 2026, 17(2), 179; https://doi.org/10.3390/f17020179 - 29 Jan 2026
Viewed by 449
Abstract
Mapping individual-tree crowns (ITCs) along with extracting tree morphological attributes provides the core parameters required for estimating thermal stress and carbon emission functions. However, calculating morphological attributes relies on the prior delineation of ITCs. Using the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) framework, this review synthesizes how deep-learning (DL)-based methods enable the conversion of crown geometry into reliable biometric parameter extraction (BPE) from high-resolution imagery. This addresses a gap often overlooked in studies focused solely on detection by providing a direct link to forest inventory metrics. Our review showed that instance segmentation dominates (approximately 46% of studies), producing the most accurate pixel-level masks for BPE, while RGB imagery is most common (73%), often integrated with canopy-height models (CHM) to enhance accuracy. New architectural approaches, such as StarDist, outperform Mask R-CNN by 6% in dense canopies. However, performance differs with crown overlap, occlusion, species diversity, and the poor transferability of allometric equations. Future work could prioritize multisensor data fusion, develop end-to-end biomass modeling to minimize allometric dependence, develop open datasets to address model generalizability, and enhance and test models like StarDist for higher accuracy in dense forests. Full article
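The link from pixel-level crown masks to biometric parameters can be made concrete with the standard circular-equivalent diameter: convert mask area to ground area via the ground sampling distance, then take the diameter of the circle with that area. The GSD value below is illustrative:

```python
import math

def crown_diameter_m(mask_pixels, gsd_m=0.05):
    """Circular-equivalent crown diameter (m) from a segmented crown mask:
    pixel count times squared ground sampling distance gives crown area,
    and d = 2 * sqrt(area / pi) is the equivalent-circle diameter fed into
    allometric equations for biometric parameter extraction."""
    area_m2 = mask_pixels * gsd_m ** 2
    return 2.0 * math.sqrt(area_m2 / math.pi)
```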

21 pages, 2960 KB  
Article
Defect Generation and Detection Strategy for Tempered Glass in Sample-Scarce Scenarios
by Kai Hou, Jing-Fang Yang, Peng Zhang, Guang-Chun Xiao, Fei Wang, Run-Ze Fan and Xiang-Feng Liu
Information 2026, 17(2), 122; https://doi.org/10.3390/info17020122 - 28 Jan 2026
Viewed by 274
Abstract
To address the challenge of defect detection in tempered glass panel production arising from sample scarcity, this paper proposes a few-shot detection methodology that integrates an enhanced Stable Diffusion model with Mask R-CNN. Specifically, the approach utilizes a Mask Encoder to optimize the Stable Diffusion architecture, employing the Structural Similarity Index Measure (SSIM) to evaluate sample quality. This process generates high-fidelity virtual samples to construct a hybrid dataset for training data augmentation. Furthermore, a resource isolation strategy is adopted to facilitate online detection using an improved semi-supervised Mask R-CNN framework. Experimental results demonstrate that the proposed scheme effectively resolves detection difficulties for eight defect types, including edge chipping and scratches. The method achieves an mAP50 of 81.5%, representing a nearly 47% improvement over baseline methods relying solely on real samples, thereby realizing high-precision and high-efficiency industrial defect detection. Full article
(This article belongs to the Section Artificial Intelligence)
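The SSIM-based quality gate on generated samples can be sketched with a simplified single-window SSIM (real implementations such as scikit-image's `structural_similarity` use local sliding windows; the acceptance threshold here is an assumption):

```python
import numpy as np

def global_ssim(x, y, c1=1e-4, c2=9e-4):
    """Whole-image SSIM (single-window simplification): compares luminance,
    contrast, and structure of two images in one statistic in [-1, 1]."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (vx + vy + c2)
    )

def keep_sample(generated, reference, threshold=0.7):
    """Accept a generated virtual sample only if it is sufficiently similar
    to a real reference image (threshold hypothetical)."""
    return global_ssim(generated, reference) >= threshold
```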

41 pages, 5796 KB  
Article
Comparative Analysis of R-CNN and YOLOv8 Segmentation Features for Tomato Ripening Stage Classification and Quality Estimation
by Ali Ahmad, Jaime Lloret, Lorena Parra, Sandra Sendra and Francesco Di Gioia
Horticulturae 2026, 12(2), 127; https://doi.org/10.3390/horticulturae12020127 - 23 Jan 2026
Viewed by 356
Abstract
Accurate classification of tomato ripening stages and quality estimation is pivotal for optimizing post-harvest management and ensuring market value. This study presents a rigorous comparative analysis of morphological and colorimetric features extracted via two state-of-the-art deep learning-based instance segmentation frameworks—Mask R-CNN and YOLOv8n-seg—and their efficacy in machine learning-driven ripening stage classification and quality prediction. Using 216 fresh-market tomato fruits across four defined ripening stages, we extracted 27 image-derived features per model, alongside 12 laboratory-measured physio-morphological traits. Multivariate analyses revealed that R-CNN features capture nuanced colorimetric and structural variations, while YOLOv8 emphasizes morphological characteristics. Machine learning classifiers trained with stratified 10-fold cross-validation achieved up to 95.3% F1-score when combining both feature sets, with R-CNN and YOLOv8 alone attaining 96.9% and 90.8% accuracy, respectively. These findings highlight a trade-off between the superior precision of R-CNN and the real-time scalability of YOLOv8. Our results demonstrate the potential of integrating complementary segmentation-derived features with laboratory metrics to enable robust, non-destructive phenotyping. This work advances the application of vision-based machine learning in precision agriculture, facilitating automated, scalable, and accurate monitoring of fruit maturity and quality. Full article
(This article belongs to the Special Issue Sustainable Practices in Smart Greenhouses)
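A few of the morphological and colorimetric descriptors such segmentation-derived feature sets typically contain can be sketched directly from a mask and an RGB image. The specific feature names are illustrative, not the paper's 27-feature list:

```python
import numpy as np

def mask_features(image_rgb, mask):
    """Extract simple descriptors from one segmented fruit: mask area in
    pixels, bounding-box aspect ratio (morphological), and mean colour
    inside the mask (colorimetric)."""
    ys, xs = np.nonzero(mask)
    h = ys.max() - ys.min() + 1
    w = xs.max() - xs.min() + 1
    mean_rgb = image_rgb[mask].mean(axis=0)
    return {"area_px": int(mask.sum()), "aspect": w / h, "mean_rgb": mean_rgb}
```

Features like these, computed per fruit from each model's masks, are what the downstream ripening-stage classifiers are trained on.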
