Search Results (772)

Search Parameters:
Keywords = pixel-labeling

35 pages, 4256 KiB  
Article
Automated Segmentation and Morphometric Analysis of Thioflavin-S-Stained Amyloid Deposits in Alzheimer’s Disease Brains and Age-Matched Controls Using Weakly Supervised Deep Learning
by Gábor Barczánfalvi, Tibor Nyári, József Tolnai, László Tiszlavicz, Balázs Gulyás and Karoly Gulya
Int. J. Mol. Sci. 2025, 26(15), 7134; https://doi.org/10.3390/ijms26157134 - 24 Jul 2025
Abstract
Alzheimer’s disease (AD) involves the accumulation of amyloid-β (Aβ) plaques, whose quantification plays a central role in understanding disease progression. Automated segmentation of Aβ deposits in histopathological micrographs enables large-scale analyses but is hindered by the high cost of detailed pixel-level annotations. Weakly supervised learning offers a promising alternative by leveraging coarse or indirect labels to reduce the annotation burden. We evaluated a weakly supervised approach to segment and analyze thioflavin-S-positive parenchymal amyloid pathology in AD and age-matched brains. Our pipeline integrates three key components, each designed to operate under weak supervision. First, robust preprocessing (including retrospective multi-image illumination correction and gradient-based background estimation) was applied to enhance image fidelity and support training, since weakly supervised models rely more heavily on image features. Second, class activation maps (CAMs), generated by a compact deep classifier (SqueezeNet), were used to identify and coarsely localize amyloid-rich parenchymal regions from patch-wise image labels, serving as spatial priors for subsequent refinement without requiring dense pixel-level annotations. Third, a patch-based convolutional neural network, U-Net, was trained on synthetic data generated from micrographs based on CAM-derived pseudo-labels via an extensive object-level augmentation strategy, enabling refined whole-image semantic segmentation and generalization across diverse spatial configurations. To ensure robustness and unbiased evaluation, we assessed the segmentation performance of the entire framework using patient-wise group k-fold cross-validation, explicitly modeling generalization across unseen individuals, which is critical in clinical scenarios. Despite relying on weak labels, the integrated pipeline achieved strong segmentation performance, with an average Dice similarity coefficient of ≈0.763 and a Jaccard index of ≈0.639, widely accepted metrics for assessing segmentation quality in medical image analysis. The resulting segmentations were also visually coherent, demonstrating that weakly supervised segmentation is a viable alternative in histopathology, where acquiring dense annotations is prohibitively labor-intensive and time-consuming. Subsequent morphometric analyses of the automatically segmented Aβ deposits revealed size-, structural complexity-, and global geometry-related differences across brain regions and cognitive status. These findings confirm that deposit architecture exhibits region-specific patterns and reflects underlying neurodegenerative processes, thereby highlighting the biological relevance and practical applicability of the proposed image-processing pipeline for morphometric analysis.
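The Dice similarity coefficient and Jaccard index reported above are standard overlap metrics for binary masks. As a minimal sketch (generic metric code, not the authors' evaluation pipeline):

```python
import numpy as np

def dice_jaccard(pred, gt):
    """Dice similarity coefficient and Jaccard index (IoU) for binary masks."""
    pred = np.asarray(pred).astype(bool)
    gt = np.asarray(gt).astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    dice = 2 * inter / (pred.sum() + gt.sum())   # 2|A∩B| / (|A|+|B|)
    jaccard = inter / union                      # |A∩B| / |A∪B|
    return float(dice), float(jaccard)
```

Note the two are monotonically related (J = D / (2 − D)), which is why papers often report both from the same segmentations.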

30 pages, 4379 KiB  
Article
Cross-Platform Comparison of Generative Design Based on a Multi-Dimensional Cultural Gene Model of the Phoenix Pattern
by Yali Wang, Xinxiong Liu, Yan Gan, Yixiao Gong, Yuchen Xi and Lin Li
Appl. Sci. 2025, 15(15), 8170; https://doi.org/10.3390/app15158170 - 23 Jul 2025
Abstract
The rapid development of generative artificial intelligence has paved the way for a new approach to reproducing and intelligently generating traditional patterns digitally. This paper focuses on the traditional Chinese phoenix pattern and constructs a “Phoenix Pattern Multidimensional Cultural Gene Model” based on grounded theory. It summarises seven semantic dimensions, covering composition pattern, pixel configuration, colour system, media technology, semantic implication, theme context, and application scenario, and divides them into explicit and implicit cultural genes. The study further proposes a control mechanism of “semantic label–prompt–image generation”, constructs a cross-platform prompt structure system suitable for Midjourney and Dreamina AI, and completes 28 groups of prompt combinations and six rounds of iterative experiments. The analysis of the results from 64 user questionnaires and 10 expert ratings reveals that Dreamina AI excels in cultural semantic restoration and context recognition, whereas Midjourney has an advantage in composition coordination and aesthetic consistency. Overall, the study verified the effectiveness of the cultural gene model in controlling AIGC generation and proposed a framework for generating innovative traditional patterns, providing a theoretical basis and practical support for the intelligent expression of cultural heritage.

15 pages, 1794 KiB  
Article
Lightweight Dual-Attention Network for Concrete Crack Segmentation
by Min Feng and Juncai Xu
Sensors 2025, 25(14), 4436; https://doi.org/10.3390/s25144436 - 16 Jul 2025
Abstract
Structural health monitoring in resource-constrained environments demands crack segmentation models that match the accuracy of heavyweight convolutional networks while conforming to the power, memory, and latency limits of watt-level edge devices. This study presents a lightweight dual-attention network: a four-stage U-Net compressed to one-quarter of the channel depth and augmented, exclusively at the deepest layer, with a compact dual-attention block that couples channel excitation with spatial self-attention. The added mechanism increases computation by only 19%, limits the weight budget to 7.4 MB, and remains fully compatible with post-training INT8 quantization. On a pixel-labelled concrete crack benchmark, the proposed network achieves an intersection over union of 0.827 and an F1 score of 0.905, outperforming CrackTree, Hybrid 2020, MobileNetV3, and ESPNetv2. While refined weight initialization and Dice-augmented loss provide slight improvements, ablation experiments show that the dual-attention module is the main factor influencing accuracy. Hardware-in-the-loop tests validate real-time viability: 110 frames per second on a 10 W Jetson Nano and 220 frames per second on a 5 W Coral TPU, with no observable accuracy loss. The proposed network thus offers cutting-edge crack segmentation at the kiloflop scale, facilitating ongoing, on-device civil infrastructure inspection.
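The channel-excitation half of a dual-attention block can be sketched as below, assuming the standard squeeze-and-excitation recipe (global pooling, two fully connected layers, sigmoid gate); the weight shapes `w1`/`w2` and the identity reduction ratio are placeholders, not this paper's configuration:

```python
import numpy as np

def channel_excitation(feat, w1, w2):
    """Gate each channel of feat (C, H, W) by a learned scalar in (0, 1).

    Squeeze: global average pool to a (C,) descriptor.
    Excite:  two fully connected layers (ReLU, then sigmoid).
    """
    z = feat.mean(axis=(1, 2))            # squeeze: (C,)
    s = np.maximum(w1 @ z, 0.0)           # FC + ReLU: (C // r,)
    g = 1.0 / (1.0 + np.exp(-(w2 @ s)))   # FC + sigmoid gate: (C,)
    return feat * g[:, None, None]        # reweight channels
```

Placing one such block only at the deepest layer, as the paper does, keeps the overhead small because the spatial extent there is already tiny.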

18 pages, 9981 KiB  
Article
Toward Adaptive Unsupervised and Blind Image Forgery Localization with ViT-VAE and a Gaussian Mixture Model
by Haichang Yin, KinTak U, Jing Wang and Wuyue Ma
Mathematics 2025, 13(14), 2285; https://doi.org/10.3390/math13142285 - 16 Jul 2025
Abstract
Most image forgery localization methods rely on supervised learning, requiring large labeled datasets for training. Recently, several unsupervised approaches based on the variational autoencoder (VAE) framework have been proposed for forged pixel detection. In these approaches, the latent space is modeled by a simple Gaussian distribution or a Gaussian Mixture Model. Despite their success, there are still some limitations: (1) A simple Gaussian distribution assumption in the latent space constrains performance due to the diverse distribution of forged images. (2) Gaussian Mixture Models (GMMs) introduce non-convex log-sum-exp functions in the Kullback–Leibler (KL) divergence term, leading to gradient instability and convergence issues during training. (3) Estimating GMM mixing coefficients typically involves either the expectation-maximization (EM) algorithm before VAE training or a multilayer perceptron (MLP), both of which increase computational complexity. To address these limitations, we propose the Deep ViT-VAE-GMM (DVVG) framework. First, we employ Jensen’s inequality to simplify the KL divergence computation, reducing gradient instability and improving training stability. Second, we introduce convolutional neural networks (CNNs) to adaptively estimate the mixing coefficients, enabling an end-to-end architecture while significantly lowering computational costs. Experimental results on benchmark datasets demonstrate that DVVG not only enhances VAE performance but also improves efficiency in modeling complex latent distributions. Our method effectively balances performance and computational feasibility, making it a practical solution for real-world image forgery localization.
(This article belongs to the Special Issue Applied Mathematics in Data Science and High-Performance Computing)
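The Jensen-based simplification plausibly takes the standard form below (an assumption about the derivation, since the exact bound is not shown here): KL divergence is convex in its second argument, so the intractable log-sum-exp term arising from a GMM prior can be replaced by a closed-form upper bound,

```latex
\mathrm{KL}\!\left(q(z)\,\Big\|\,\textstyle\sum_{j}\omega_j\,\mathcal{N}(z;\mu_j,\Sigma_j)\right)
\;\le\; \sum_{j}\omega_j\,\mathrm{KL}\!\left(q(z)\,\big\|\,\mathcal{N}(z;\mu_j,\Sigma_j)\right),
```

where each Gaussian-to-Gaussian KL term on the right is available in closed form, removing the non-convex log-sum-exp from the training objective.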

29 pages, 8563 KiB  
Article
A Bridge Crack Segmentation Algorithm Based on Fuzzy C-Means Clustering and Feature Fusion
by Yadong Yao, Yurui Zhang, Zai Liu and Heming Yuan
Sensors 2025, 25(14), 4399; https://doi.org/10.3390/s25144399 - 14 Jul 2025
Abstract
In response to the limitations of traditional image processing algorithms, such as high noise sensitivity and threshold dependency in bridge crack detection, and the extensive labeled data requirements of deep learning methods, this study proposes a novel crack segmentation algorithm based on fuzzy C-means (FCM) clustering and multi-feature fusion. A three-dimensional feature space is constructed using B-channel pixels and fuzzy clustering with c = 3, justified by the distinct distribution patterns of these three regions in the image, enabling effective preliminary segmentation. To enhance accuracy, connected domain labeling combined with a circularity threshold is introduced to differentiate linear cracks from granular noise. Furthermore, a 5 × 5 neighborhood search strategy, based on crack pixel amplitude, is designed to restore the continuity of fragmented cracks. Experimental results on the Concrete Crack and SDNET2018 datasets demonstrate that the proposed algorithm achieves an accuracy of 0.885 and a recall rate of 0.891, outperforming DeepLabv3+ by 4.2%. Notably, with a processing time of only 0.8 s per image, the algorithm balances high accuracy with real-time efficiency, effectively addressing challenges such as missed fine cracks and misjudged broken cracks in noisy environments by integrating geometric features and pixel distribution characteristics. This study provides an efficient unsupervised solution for bridge damage detection.
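The circularity criterion for separating elongated cracks from granular noise is presumably the classic shape factor 4πA/P², which is 1 for a circle and approaches 0 for thin lines. A sketch (the 0.3 cutoff is an illustrative assumption, not the paper's threshold):

```python
import math

def is_crack(area, perimeter, circ_thresh=0.3):
    """True if a connected component is crack-like (elongated, low circularity).

    circularity = 4*pi*A / P**2: 1.0 for a perfect circle (granular noise),
    near 0 for long thin shapes (cracks).
    """
    circularity = 4 * math.pi * area / (perimeter ** 2)
    return circularity < circ_thresh
```

In practice the areas and perimeters would come from connected-component labeling of the preliminary FCM segmentation.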

14 pages, 6691 KiB  
Article
Remote Sensing Extraction of Damaged Buildings in the Shigatse Earthquake, 2025: A Hybrid YOLO-E and SAM2 Approach
by Zhimin Wu, Chenyao Qu, Wei Wang, Zelang Miao and Huihui Feng
Sensors 2025, 25(14), 4375; https://doi.org/10.3390/s25144375 - 12 Jul 2025
Abstract
In January 2025, a magnitude 6.8 earthquake struck Dingri County, Shigatse, Tibet, causing severe damage. Rapid and precise extraction of damaged buildings is essential for emergency relief and rebuilding efforts. This study proposes an approach integrating YOLO-E (Real-Time Seeing Anything) and the Segment Anything Model 2 (SAM2) to extract damaged buildings from multi-source remote sensing images, including post-earthquake Gaofen-7 imagery (0.80 m), Beijing-3 imagery (0.30 m), and pre-earthquake Google satellite imagery (0.15 m), over the affected region. In this hybrid approach, YOLO-E functions as the preliminary segmentation module: it leverages its real-time detection and segmentation capability to locate potential damaged-building regions and rapidly generate coarse segmentation masks. SAM2 then follows as a refinement step, incorporating shapefile information from pre-disaster sources to apply precise, pixel-level segmentation. The dataset used for training contained labeled examples of damaged buildings, and model optimization was carried out using stochastic gradient descent (SGD), with cross-entropy and mean squared error as the loss functions. Upon evaluation, the model reached a precision of 0.840, a recall of 0.855, an F1-score of 0.847, and an IoU of 0.735. It successfully extracted 492 suspected damaged-building patches within a radius of 20 km from the earthquake epicenter, clearly showing that the damaged buildings were concentrated in the earthquake fault zone. By leveraging multi-source remote sensing imagery, this hybrid YOLO-E and SAM2 approach thus delivers precise and rapid extraction of damaged buildings, effectively supporting targeted earthquake rescue and post-disaster reconstruction efforts in the Dingri County fault zone.
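The four reported scores are mutually consistent if computed from shared counts of true positives, false positives, and false negatives; a generic sketch (not the authors' evaluation code) shows the relationship, and indeed precision 0.840 with recall 0.855 yields an F1 of ≈0.847 and an IoU of ≈0.735:

```python
def detection_metrics(tp, fp, fn):
    """Precision, recall, F1, and IoU from true/false positive and false negative counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of P and R
    iou = tp / (tp + fp + fn)                           # intersection over union
    return precision, recall, f1, iou
```

IoU is always the stricter of the two overlap summaries (IoU = F1 / (2 − F1)), which is why it sits below F1 here.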

14 pages, 29613 KiB  
Article
Unsupervised Insulator Defect Detection Method Based on Masked Autoencoder
by Yanying Song and Wei Xiong
Sensors 2025, 25(14), 4271; https://doi.org/10.3390/s25144271 - 9 Jul 2025
Abstract
With the rapid expansion of high-speed rail infrastructure, maintaining the structural integrity of insulators is critical to operational safety. However, conventional defect detection techniques typically rely on extensive labeled datasets, struggle with class imbalance, and often fail to capture large-scale structural anomalies. In this paper, we present an unsupervised insulator defect detection framework based on a masked autoencoder (MAE) architecture. Built upon a vision transformer (ViT), the model employs an asymmetric encoder-decoder structure and leverages a high-ratio random masking scheme during training to facilitate robust representation learning. At inference, a dual-pass interval masking strategy enhances defect localization accuracy. Benchmark experiments across multiple datasets demonstrate that our method delivers competitive image- and pixel-level performance while significantly reducing computational overhead compared to existing ViT-based approaches. By enabling high-precision defect detection through image reconstruction without requiring manual annotations, this approach offers a scalable and efficient solution for real-time industrial inspection under limited supervision.
(This article belongs to the Section Fault Diagnosis & Sensors)
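The high-ratio random masking used during training can be sketched as a patch sampler (a generic MAE-style sketch; the 75% ratio in the usage example is the common MAE default, not necessarily this paper's setting):

```python
import numpy as np

def random_mask(num_patches, mask_ratio, rng):
    """Return sorted indices of the patches kept visible to the encoder;
    the remaining (masked) patches must be reconstructed by the decoder."""
    n_keep = int(round(num_patches * (1 - mask_ratio)))
    return np.sort(rng.permutation(num_patches)[:n_keep])
```

For a 14 × 14 ViT patch grid, `random_mask(196, 0.75, np.random.default_rng(0))` keeps 49 of 196 patches, so the encoder only ever sees a quarter of the image, which is the main source of the MAE's computational savings.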

30 pages, 4399 KiB  
Article
Confident Learning-Based Label Correction for Retinal Image Segmentation
by Tanatorn Pethmunee, Supaporn Kansomkeat, Patama Bhurayanontachai and Sathit Intajag
Diagnostics 2025, 15(14), 1735; https://doi.org/10.3390/diagnostics15141735 - 8 Jul 2025
Abstract
Background/Objectives: In automatic medical image analysis, particularly for diabetic retinopathy, the accuracy of labeled data is crucial, as label noise can significantly complicate the analysis and lead to diagnostic errors. To tackle the issue of label noise in retinal image segmentation, an innovative label correction framework is introduced that combines Confident Learning (CL) with a human-in-the-loop re-annotation process to meticulously detect and rectify pixel-level labeling inaccuracies. Methods: Two CL-oriented strategies are assessed: Confident Joint Analysis (CJA), employing DeeplabV3+ with a ResNet-50 architecture, and Prune by Noise Rate (PBNR), utilizing ResNet-18. These methodologies are implemented on four publicly available retinal image datasets: HRF, STARE, DRIVE, and CHASE_DB1. After the models have been trained on the original labeled datasets, label noise is quantified, and amendments are executed on suspected misclassified pixels prior to the assessment of model performance. Results: The reduction in label noise yielded consistent advancements in accuracy, Intersection over Union (IoU), and weighted IoU across all the datasets. The segmentation of tiny structures, such as the fovea, demonstrated a significant enhancement following refinement. The Mean Boundary F1 Score (MeanBFScore) remained invariant, signifying the maintenance of boundary integrity. CJA and PBNR demonstrated strengths under different conditions, producing variations in performance that depended on the noise level and dataset characteristics. CL-based label correction techniques, when combined with human refinement, significantly enhanced segmentation accuracy and evaluation robustness, achieving Accuracy, IoU, and MeanBFScore values of 0.9156, 0.8037, and 0.9856 with regard to the original ground truth, reflecting increases of 4.05%, 9.95%, and 1.28%, respectively.
Conclusions: This methodology represents a feasible and scalable solution to the challenge of label noise in medical image analysis, holding particular significance for real-world clinical applications.
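Confident Learning's core step — flagging samples whose predicted probability for their given label falls below that class's average self-confidence — can be sketched as follows (a simplified per-sample view of CL's confident-joint construction, not the paper's implementation; here each "sample" would be a pixel):

```python
import numpy as np

def suspect_label_errors(probs, labels):
    """Flag samples whose predicted probability for their given label falls
    below the class's average self-confidence (the CL per-class threshold).

    probs:  (N, K) predicted class probabilities from cross-validation.
    labels: (N,) given (possibly noisy) integer labels.
    """
    labels = np.asarray(labels)
    # Per-class threshold: mean predicted probability among samples given that label.
    thresholds = {c: probs[labels == c, c].mean() for c in np.unique(labels)}
    self_conf = probs[np.arange(len(labels)), labels]
    return np.array([self_conf[i] < thresholds[labels[i]] for i in range(len(labels))])
```

The flagged items are then handed to the human-in-the-loop re-annotation step rather than being corrected automatically.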

17 pages, 7477 KiB  
Article
The Development of a Lane Identification and Assessment Framework for Maintenance Using AI Technology
by Hohyuk Na, Do Gyeong Kim, Ji Min Kang and Chungwon Lee
Appl. Sci. 2025, 15(13), 7410; https://doi.org/10.3390/app15137410 - 1 Jul 2025
Abstract
This study proposes a vision-based framework to support autonomous vehicles (AVs) in maintaining stable lane-keeping by assessing the condition of lane markings. Unlike existing infrastructure standards focused on human visibility, this study addresses the need for criteria suited to sensor-based AV environments. Using real driving data from urban expressways in Seoul, a YOLOv5-based lane detection algorithm was developed and enhanced through multi-label annotation and data augmentation. The model achieved a mean average precision (mAP) of 97.4% and demonstrated strong generalization on external datasets such as KITTI and TuSimple. For lane condition assessment, a pixel occupancy–based method was applied, combined with Canny edge detection and morphological operations. A threshold of 80-pixel occupancy was used to classify lanes as intact or worn. The proposed framework reliably detected lane degradation under various road and lighting conditions. These results suggest that quantitative, image-based indicators can complement traditional standards and guide AV-oriented infrastructure policy. Limitations include a lack of adverse weather data and dataset-specific threshold sensitivity.
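The pixel occupancy rule can be sketched as below. Note the abstract is ambiguous about whether the 80 threshold is an absolute pixel count or a percentage; this sketch assumes an absolute count over a detected marking region, which is an assumption, not the paper's definition:

```python
import numpy as np

def lane_condition(lane_mask, occupancy_thresh=80):
    """Classify a detected lane-marking region as intact or worn.

    lane_mask: binary mask of marking pixels surviving Canny edge detection
               and morphological cleanup (assumed preprocessing).
    """
    occupancy = int(np.count_nonzero(lane_mask))
    return "intact" if occupancy >= occupancy_thresh else "worn"
```

The paper itself notes that this threshold is dataset-specific, so it would need recalibration per camera setup and image resolution.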

18 pages, 1566 KiB  
Article
Synthesizing Remote Sensing Images from Land Cover Annotations via Graph Prior Masked Diffusion
by Kai Deng, Siyuan Wei, Shiyan Pang, Huiwei Jiang and Bo Su
Remote Sens. 2025, 17(13), 2254; https://doi.org/10.3390/rs17132254 - 30 Jun 2025
Abstract
Semantic image synthesis (SIS) in remote sensing aims to generate high-fidelity satellite imagery from land use/land cover (LULC) labels, supporting applications such as map updating, data augmentation, and environmental monitoring. However, the existing methods typically focus on pixel-level semantic-to-image translation, neglecting the spatial and semantic relationships among land cover objects, which hinders accurate scene structure modeling. To address this challenge, we propose GMDiT, an enhanced conditional diffusion model that extends the masked DiT architecture with graph-prior modeling. By jointly incorporating relational graph structures and semantic labels, GMDiT explicitly captures the object-level spatial and semantic dependencies, thereby improving the contextual coherence and structural fidelity of the synthesized images. Specifically, to effectively capture inter-object dependencies, we first encode the semantics of each node using CLIP and then employ a simple yet effective graph transformer to model the spatial interactions among nodes. Additionally, we design a scene similarity sampling strategy for the reverse diffusion process, improving contextual alignment while maintaining generative diversity. Experiments on the OpenEarthMap dataset show that GMDiT achieves superior performance in terms of FID and other metrics, demonstrating its effectiveness and robustness in the generation of structured remote sensing images.
(This article belongs to the Special Issue Fifth Anniversary of “AI Remote Sensing” Section)

31 pages, 31711 KiB  
Article
On the Usage of Deep Learning Techniques for Unmanned Aerial Vehicle-Based Citrus Crop Health Assessment
by Ana I. Gálvez-Gutiérrez, Frederico Afonso and Juana M. Martínez-Heredia
Remote Sens. 2025, 17(13), 2253; https://doi.org/10.3390/rs17132253 - 30 Jun 2025
Abstract
This work proposes an end-to-end solution for leaf segmentation, disease detection, and damage quantification, specifically focusing on citrus crops. The primary motivation behind this research is to enable the early detection of phytosanitary problems, which directly impact the productivity and profitability of Spanish and Portuguese agricultural developments, while ensuring environmentally safe management practices. It integrates an onboard computing module for Unmanned Aerial Vehicles (UAVs) using a Raspberry Pi 4 with Global Positioning System (GPS) and camera modules, allowing the real-time geolocation of images in citrus croplands. To address the lack of public data, a comprehensive database was created and manually labelled at the pixel level to provide accurate training data for a deep learning approach. To reduce annotation effort, we developed a custom automation algorithm for pixel-wise labelling in complex natural backgrounds. A SegNet architecture with a Visual Geometry Group 16 (VGG16) backbone was trained for the semantic, pixel-wise segmentation of citrus foliage. The model was successfully integrated as a modular component within a broader system architecture and was tested with UAV-acquired images, demonstrating accurate disease detection and quantification, even under varied conditions. The developed system provides a robust tool for the efficient monitoring of citrus crops in precision agriculture.
(This article belongs to the Special Issue Application of Satellite and UAV Data in Precision Agriculture)
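Given a per-pixel segmentation of foliage, damage quantification reduces to a ratio of class areas. A sketch under an assumed class encoding (the 0/1/2 labels are illustrative, not the paper's scheme):

```python
import numpy as np

def damage_ratio(seg):
    """Fraction of foliage that is diseased.

    seg: per-pixel class map; assumed encoding 0 = background,
         1 = healthy leaf, 2 = diseased leaf (illustrative labels).
    """
    healthy = np.count_nonzero(seg == 1)
    diseased = np.count_nonzero(seg == 2)
    return diseased / (healthy + diseased)
```

Because the ratio ignores background pixels, it is insensitive to how much of the frame the canopy occupies, which matters for UAV imagery taken at varying altitudes.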

27 pages, 4947 KiB  
Article
From Coarse to Crisp: Enhancing Tree Species Maps with Deep Learning and Satellite Imagery
by Taebin Choe, Seungpyo Jeon, Byeongcheol Kim and Seonyoung Park
Remote Sens. 2025, 17(13), 2222; https://doi.org/10.3390/rs17132222 - 28 Jun 2025
Abstract
Accurate, detailed, and up-to-date tree species distribution information is essential for effective forest management and environmental research. However, existing tree species maps face limitations in resolution and update cycle, making it difficult to meet modern demands. To overcome these limitations, this study proposes a novel framework that utilizes existing medium-resolution national tree species maps as ‘weak labels’ and fuses multi-temporal Sentinel-2 and PlanetScope satellite imagery data. Specifically, a super-resolution (SR) technique, using PlanetScope imagery as a reference, was first applied to Sentinel-2 data to enhance its resolution to 2.5 m. Then, these enhanced Sentinel-2 bands were combined with PlanetScope bands to construct the final multi-spectral, multi-temporal input data. Deep learning (DL) model training data was constructed by strategically sampling information-rich pixels from the national tree species map. Applying the proposed methodology to Sobaeksan and Jirisan National Parks in South Korea, the performance of various machine learning (ML) and deep learning (DL) models was compared, including traditional ML (linear regression, random forest) and DL architectures (multilayer perceptron (MLP), spectral encoder block (SEB)—linear, and SEB-transformer). The MLP model demonstrated optimal performance, achieving over 85% overall accuracy (OA) and more than 81% accuracy in classifying spectrally similar and difficult-to-distinguish species, specifically Quercus mongolica (QM) and Quercus variabilis (QV). Furthermore, while spectral and temporal information were confirmed to contribute significantly to tree species classification, the contribution of spatial (texture) information was experimentally found to be limited at the 2.5 m resolution level. This study presents a practical method for creating high-resolution tree species maps scalable to the national level by fusing existing tree species maps with Sentinel-2 and PlanetScope imagery without requiring costly separate field surveys. Its significance lies in establishing a foundation that can contribute to various fields such as forest resource management, biodiversity conservation, and climate change research.
(This article belongs to the Special Issue Digital Modeling for Sustainable Forest Management)

28 pages, 5886 KiB  
Article
Burned Area Detection in the Eastern Canadian Boreal Forest Using a Multi-Layer Perceptron and MODIS-Derived Features
by Hadi Mahmoudi Meimand, Jiaxin Chen, Daniel Kneeshaw, Mohammadreza Bakhtyari and Changhui Peng
Remote Sens. 2025, 17(13), 2162; https://doi.org/10.3390/rs17132162 - 24 Jun 2025
Abstract
Wildfires play a critical role in boreal forest ecosystems, yet their increasing frequency poses significant challenges for carbon emissions, ecosystem stability, and fire management. Accurate burned area detection is essential for assessing post-fire landscape recovery and fire-induced carbon fluxes. This study develops, compares, and optimizes machine learning (ML)-based models for burned area classification in the eastern Canadian boreal forest from 2000 to 2023 using MODIS-derived features extracted from Google Earth Engine (GEE); the feature extraction includes maximum, minimum, mean, and median values per feature to enhance spectral representation and reduce noise. The dataset was randomly split into training (70%), validation (15%), and testing (15%) sets for model development and assessment. Combined labels were used due to class imbalance, and the model performance was assessed using kappa and the F1-score. Among the ML techniques tested, deep learning (DL) with a Multi-Layer Perceptron (MLP) outperformed Support Vector Machines (SVMs) and Random Forest (RF), demonstrating superior classification accuracy in detecting burned area. It achieved an F1-score of 0.89 for burned pixels, confirming its potential for improving long-term wildfire monitoring and management in boreal forests. Despite the computational demands of processing large-scale remote sensing data at 250 m resolution, the MLP modeling approach that we used provides an efficient, effective, and scalable solution for long-term burned area detection. These findings underscore the importance of tuning both network architecture and regularization parameters to improve the classification of burned pixels, enhancing the model robustness and generalizability.
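Cohen's kappa, used here alongside the F1-score because of the class imbalance, compares observed agreement with the agreement expected by chance. A generic two-class sketch from confusion counts (not the authors' code):

```python
def cohens_kappa(tp, fp, fn, tn):
    """Cohen's kappa for a binary classifier from its confusion matrix.

    po: observed accuracy; pe: chance agreement given the marginal
    frequencies of predicted and true classes.
    """
    n = tp + fp + fn + tn
    po = (tp + tn) / n
    pe = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    return (po - pe) / (1 - pe)
```

Unlike raw accuracy, kappa drops to 0 when a model merely predicts the majority class, which is exactly the failure mode imbalance invites in burned-area mapping.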
11 pages, 3502 KiB  
Technical Note
Defect Detection and Error Source Tracing in Laser Marking of Silicon Wafers with Machine Learning
by Hsiao-Chung Wang, Teng-To Yu and Wen-Fei Peng
Appl. Sci. 2025, 15(13), 7020; https://doi.org/10.3390/app15137020 - 22 Jun 2025
Viewed by 630
Abstract
Laser marking on wafers can introduce various defects such as inconsistent mark quality, under- or over-etching, and misalignment. Excessive laser power and inadequate cooling can cause burning or warping. These defects have been inspected using machine vision, confocal microscopy, optical and scanning electron microscopy, acoustic/ultrasonic methods, and inline monitoring with coaxial vision. Machine learning has been successfully applied to improve classification accuracy, and we propose a random forest algorithm with a training database that not only detects a defect but also traces its cause. Four causes were identified: unstable laser power, a dirty laser head, platform shaking, and voltage fluctuation of the electrical power. The object-matching technique ensures that a visible image can be utilized without precise localization. Each inspected image was compared to the standard (qualified) product image pixel by pixel, and the 2D matrix pattern for each type of defect was gathered. Ten photos of each defect type were included in the training set to build the model with the corresponding labels; on synthetic testing images altered by the defect-cause model, the approach achieved accuracies of 97.0% in laser marking defect inspection and 91.6% in sorting the error cause, respectively. Full article
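The pixel-by-pixel comparison step described above can be sketched as follows: each inspected image is differenced against the qualified reference image, and the resulting 2D defect-pattern matrix is flattened into a feature vector for a random forest that labels the error cause. All images here are synthetic stand-ins, and the four cause labels follow the abstract; the noise model per cause is a hypothetical placeholder.

```python
# Sketch of defect-cause tracing: difference each inspected image against
# the qualified reference, then classify the 2D defect matrix with a
# random forest. Images and per-cause distortions are synthetic.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
reference = rng.integers(0, 256, size=(32, 32)).astype(float)  # stand-in qualified image

causes = ["unstable_power", "dirty_head", "platform_shake", "voltage_fluctuation"]
X, y = [], []
for label, cause in enumerate(causes):
    for _ in range(10):  # 10 photos per defect type, as in the study
        # hypothetical distortion: each cause perturbs the image differently
        noisy = reference + rng.normal(scale=5 + 10 * label, size=reference.shape)
        diff = np.abs(noisy - reference)   # 2D defect-pattern matrix
        X.append(diff.ravel())             # flatten for the classifier
        y.append(label)

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
pred = clf.predict([X[0]])
print("predicted cause:", causes[pred[0]])
```

A real pipeline would first apply the object-matching step so that the inspected and reference images are aligned before differencing; that alignment is omitted here.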
(This article belongs to the Section Computing and Artificial Intelligence)
15 pages, 3069 KiB  
Article
Research on Weakly Supervised Face Segmentation Technology Based on Visual Large Models in New Media Post-Production
by Baihui Tang and Sanxing Cao
Appl. Sci. 2025, 15(12), 6843; https://doi.org/10.3390/app15126843 - 18 Jun 2025
Viewed by 232
Abstract
Face segmentation is a critical component in new media post-production, enabling the precise separation of facial regions from complex backgrounds at the pixel level. With the increasing demand for flexible and efficient segmentation solutions across diverse media scenarios, such as variety shows, period dramas, and other productions, there is a pressing need for adaptable methods that perform reliably under varying conditions. However, existing approaches depend primarily on fully supervised learning, which requires extensive manual annotation and incurs high labor costs. To overcome these limitations, we propose a novel weakly supervised face segmentation framework that leverages large-scale vision models to automatically generate high-quality pseudo-labels. These pseudo-labels are then used to train segmentation networks in a dual-model architecture, where two complementary models collaboratively enhance segmentation performance. Our method significantly reduces reliance on manual labeling while maintaining competitive accuracy. Extensive experiments demonstrate that our approach not only improves segmentation precision and efficiency but also streamlines post-production workflows, lowering human effort and accelerating project timelines. Furthermore, this framework reduces the reliance on annotations in weakly supervised learning for facial image processing in new media post-production scenarios. Full article
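The pseudo-labeling step at the heart of the framework can be illustrated in miniature: a large vision model's per-pixel face probability map is binarized into a mask that then serves as a training target for the segmentation networks. The probability map below is a synthetic stand-in (no real vision model is called), and the threshold value is an assumption; the dual-model training loop itself is not reproduced.

```python
# Minimal sketch of pseudo-label generation for weakly supervised
# segmentation: threshold a per-pixel probability map into a binary
# face/background mask. The probability map is synthetic.
import numpy as np

def probs_to_pseudo_label(prob_map: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Binarize a per-pixel probability map into a 0/1 pseudo-label mask."""
    return (prob_map >= threshold).astype(np.uint8)

rng = np.random.default_rng(2)
prob_map = rng.random((64, 64))          # stand-in for a vision model's output
mask = probs_to_pseudo_label(prob_map)   # pseudo-label for training
```

In the dual-model architecture described above, each of the two complementary networks would be trained against such masks, with their agreement or mutual supervision used to refine low-confidence regions.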