MDPI - Publisher of Open Access Journals

15 pages, 3194 KB

Open Access | Article

Detection of Microplastics in Coastal Environments Based on Semantic Segmentation

by Javier Lorenzo-Navarro, José Salas-Cáceres, Modesto Castrillón-Santana, May Gómez and Alicia Herrera

Microplastics 2026, 5(2), 66; https://doi.org/10.3390/microplastics5020066 - 3 Apr 2026

Viewed by 681

Abstract

Microplastics represent an emerging threat to aquatic ecosystems, human health, and coastal aesthetics, with increasing concern about their accumulation on beaches due to ocean currents, wave action, and accidental spills. Despite their environmental impact, current methods for detecting and quantifying microplastics remain largely manual, time-consuming, and spatially limited. In this study, we propose a deep learning-based approach for the semantic segmentation of microplastics on sandy beaches, enabling pixel-level localization of small particles under real-world conditions. Twelve segmentation models were evaluated, including U-Net and its variants (Attention U-Net, ResUNet), as well as state-of-the-art architectures such as LinkNet, PAN, PSPNet, and YOLOv11 with segmentation heads. Models were trained and tested on augmented data patches, and their performance was assessed using Intersection over Union (IoU) and Dice coefficient metrics. LinkNet achieved the best performance with a Dice coefficient of 80% and an IoU of 72.6% on the test set, showing superior capability in segmenting microplastics even in the presence of visual clutter such as debris or sand variation. Qualitative results support the quantitative findings, highlighting the robustness of the model in complex scenes. Full article
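The two metrics named in this abstract, Intersection over Union (IoU) and the Dice coefficient, can be illustrated with a minimal sketch. It assumes binary masks stored as nested lists of 0/1 and treats two empty masks as a perfect match; both choices are illustrative assumptions, and the function name `iou_and_dice` is hypothetical rather than taken from the paper.

```python
def iou_and_dice(pred, target):
    """Return (IoU, Dice) for two equally sized binary masks (nested lists of 0/1)."""
    # Flatten both masks so they can be compared pixel by pixel.
    p = [v for row in pred for v in row]
    t = [v for row in target for v in row]
    if len(p) != len(t):
        raise ValueError("masks must have the same number of pixels")

    intersection = sum(1 for a, b in zip(p, t) if a and b)
    pred_area = sum(1 for a in p if a)
    target_area = sum(1 for b in t if b)
    union = pred_area + target_area - intersection

    # Assumed convention: two empty masks count as a perfect match.
    if union == 0:
        return 1.0, 1.0

    iou = intersection / union
    dice = 2 * intersection / (pred_area + target_area)
    return iou, dice


# Toy 2x3 masks: 2 overlapping pixels, 3 predicted, 3 in the ground truth.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
iou, dice = iou_and_dice(pred, target)
print(f"IoU={iou:.3f}, Dice={dice:.3f}")
```

For a single pair of masks the two metrics are tied by Dice = 2·IoU / (1 + IoU); scores averaged over many images, as reported figures often are, need not satisfy that identity.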

(This article belongs to the Topic Plastic Contamination (Plastamination): An Environmental and Public Health-Related Concern)

23 pages, 8222 KB

Open Access | Feature Paper | Article

HRSRD: A High-Resolution SAR Road Dataset and MSDA-LinkNet for Road Extraction with Multi-Scale Deformable Attention

by Jiaxin Ma, Dong Wang, Zhaoguo Deng, Yusen Li, Chenxi Xu, Zhigao Yang and Lihua Zhong

Electronics 2026, 15(6), 1236; https://doi.org/10.3390/electronics15061236 - 16 Mar 2026

Viewed by 409

Abstract

High-resolution synthetic aperture radar (SAR) imagery is essential for large-scale road extraction, yet it presents significant challenges due to inherent speckle noise, complex scattering effects, and the anisotropic nature of road structures. Moreover, the scarcity of large-scale, high-quality annotated SAR road datasets hinders the development of deep learning-based methods. To address these issues, this paper first constructs a high-resolution SAR road dataset covering representative regions in the western United States. Road annotations are automatically generated using OpenStreetMap (OSM) vectors and then refined via a structure-guided alignment strategy. Building upon this dataset, we propose a novel framework termed Multi-Scale and Deformable-Attention LinkNet (MSDA-LinkNet), specifically designed to capture thin, direction-sensitive, and geometrically complex road features. The architecture integrates a parallel direction-aware multi-scale convolution module to explicitly model road anisotropy and scale variations, complemented by a deformable attention mechanism to adaptively aggregate contextual information along curved and irregular trajectories. Extensive experiments demonstrate that MSDA-LinkNet consistently outperforms representative approaches across key metrics, including Precision, F1-score, and Intersection over Union (IoU). The released dataset and benchmark provide a solid foundation for future research in high-resolution SAR-based road mapping. Full article

(This article belongs to the Special Issue New Challenges in Remote Sensing Image Processing)

22 pages, 4095 KB

Open Access | Feature Paper | Article

Precise Extraction of Croplands from Remote Sensing Images in Egypt by a Dual-Encoder U-Net with Multi-Scale Axial Attention and Boundary Constraints

by Yong Li, Han Ding, Heiko Balzter, Vagner Ferreira, Ying Ge, Hongyan Wang, Huiyu Zhou, Tengbo Sun, Lulu Shi, Meiyun Lai and Xiuhui Liu

Land 2026, 15(2), 305; https://doi.org/10.3390/land15020305 - 11 Feb 2026

Viewed by 993

Abstract

Accurate cropland parcel mapping is essential for food security and sustainable land management in arid Africa, yet it remains challenging in Egypt due to edge blurring, spectral confusion, and fragmented fields in medium-resolution imagery. A novel dual-encoder deep learning method that integrates multi-scale axial attention and boundary constraints (MAA-BCNet) is proposed for the precise extraction of croplands in Egypt from Sentinel-2 multispectral images. A dual-path encoder is designed to fuse CNN-based local textures with an RMT global branch using spatial decay attention for complementary feature extraction. A multi-scale axial attention module is introduced to capture anisotropic parcel structures for improved spectral–spatial discrimination, and a multi-directional gradient edge enhancement module is developed for explicitly preserving boundary integrity. A U-Net++ decoder is employed for dense multi-scale aggregation. Experimental results in Egypt demonstrate that MAA-BCNet achieves superior performance in delineating cropland parcels, particularly for irregular or fragmented croplands with complex landscapes and fuzzy boundaries. Compared with the widely used segmentation models such as DeepLabV3_plus, PSPnet, Link_net, FCN_resnet101, and U-Net++ under the same training and evaluation settings, our model has the best performance, with Recall, Precision, IoU, and F1-Score reaching 94.92%, 90.77%, 86.57%, and 92.80%, respectively. These advancements make MAA-BCNet suitable for cropland mapping of large areas of Egypt, with applications in precision agriculture and sustainable land management. Full article
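The four figures reported for MAA-BCNet can be cross-checked against each other. The sketch below assumes that all four were computed from one pooled confusion matrix for a single foreground class, which the abstract does not state; under that assumption F1 and IoU follow directly from precision and recall. The function names are hypothetical.

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def iou_from_precision_recall(precision, recall):
    """IoU = TP / (TP + FP + FN), rewritten via 1/IoU = 1/P + 1/R - 1."""
    return precision * recall / (precision + recall - precision * recall)


# Values quoted in the abstract, converted from percentages to fractions.
precision, recall = 0.9077, 0.9492
f1 = f1_from_precision_recall(precision, recall)
iou = iou_from_precision_recall(precision, recall)
print(f"F1={100 * f1:.2f}%  IoU={100 * iou:.2f}%")
```

Under this assumption the derived F1 matches the reported 92.80% and the derived IoU lands within about 0.01 percentage points of the reported 86.57%, a gap that rounding of the inputs would explain.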

(This article belongs to the Topic Applications of Artificial Intelligence Models and Spatiotemporal Data in Agriculture and the Ecological Environment)

21 pages, 4359 KB

Open Access | Article

Identification of NAPL Contamination Occurrence States in Low-Permeability Sites Using UNet Segmentation and Electrical Resistivity Tomography

by Mengwen Gao, Yu Xiao and Xiaolei Zhang

Appl. Sci. 2025, 15(13), 7109; https://doi.org/10.3390/app15137109 - 24 Jun 2025

Viewed by 1123

Abstract

To address the challenges in identifying NAPL contamination within low-permeability clay sites, this study innovatively integrates high-density electrical resistivity tomography (ERT) with a UNet deep learning model to establish an intelligent contamination detection system. Taking an industrial site in Shanghai as the research object, we collected apparent resistivity data using the WGMD-9 system, obtained resistivity profiles through inversion imaging, and constructed training sets by generating contamination labels via K-means clustering. A semantic segmentation model with skip connections and multi-scale feature fusion was developed based on the UNet architecture to achieve automatic identification of contaminated areas. Experimental results demonstrate that the model achieves a mean Intersection over Union (mIoU) of 86.58%, an accuracy (Acc) of 99.42%, a precision (Pre) of 75.72%, a recall (Rec) of 76.80%, and an F1 score (f1) of 76.23%, effectively overcoming the noise interference in electrical anomaly interpretation through conventional geophysical methods in low-permeability clay, while outperforming DeepLabV3, DeepLabV3+, PSPNet, and LinkNet models. Time-lapse resistivity imaging verifies the feasibility of dynamic monitoring for contaminant migration, while the integration of the VGG-16 encoder and hyperparameter optimization (learning rate of 0.0001 and batch size of 8) significantly enhances model performance. Case visualization reveals high consistency between segmentation results and actual contamination distribution, enabling precise localization of spatial morphology for contamination plumes. This technological breakthrough overcomes the high-cost and low-efficiency limitations of traditional borehole sampling, providing a high-precision, non-destructive intelligent detection solution for contaminated site remediation. Full article

26 pages, 7349 KB

Open Access | Article

Enhancing DeepLabv3+ Convolutional Neural Network Model for Precise Apple Orchard Identification Using GF-6 Remote Sensing Images and PIE-Engine Cloud Platform

by Guining Gao, Zhihan Chen, Yicheng Wei, Xicun Zhu and Xinyang Yu

Remote Sens. 2025, 17(11), 1923; https://doi.org/10.3390/rs17111923 - 31 May 2025

Cited by 2 | Viewed by 2494

Abstract

Utilizing remote sensing models to monitor apple orchards facilitates the industrialization of agriculture and the sustainable development of rural land resources. This study enhanced the DeepLabv3+ model to achieve superior performance in apple orchard identification by incorporating ResNet, optimizing the algorithm, and adjusting hyperparameter configuration using the PIE-Engine cloud platform. GF-6 PMS images were used as the data source, and Qixia City was selected as the case study area for demonstration. The results indicate that the accuracies of apple orchard identification using the proposed DeepLabv3+_34, DeepLabv3+_50, and DeepLabv3+_101 reached 91.17%, 92.55%, and 94.37%, respectively. DeepLabv3+_101 demonstrated superior identification performance for apple orchards compared with ResU-Net and LinkNet, with an average accuracy improvement of over 3%. The identified area of apple orchards using the DeepLabv3+_101 model was 629.32 km², accounting for 31.20% of Qixia City’s total area; apple orchards were mainly located in the western part of the study area. The innovation of this research lies in combining image annotation and object-oriented methods during training, improving annotation efficiency and accuracy. Additionally, an enhanced DeepLabv3+ model was constructed based on GF-6 satellite images and the PIE-Engine cloud platform, exhibiting superior performance in feature expression compared with conventional machine learning classification and recognition algorithms. Full article

(This article belongs to the Special Issue Remote Sensing Image Classification: Theory and Application)

17 pages, 7698 KB

Open Access | Editor’s Choice | Article

Plant Disease Segmentation Networks for Fast Automatic Severity Estimation Under Natural Field Scenarios

by Chenyi Zhao, Changchun Li, Xin Wang, Xifang Wu, Yongquan Du, Huabin Chai, Taiyi Cai, Hengmao Xiang and Yinghua Jiao

Agriculture 2025, 15(6), 583; https://doi.org/10.3390/agriculture15060583 - 10 Mar 2025

Cited by 14 | Viewed by 4404

Abstract

The segmentation of plant disease images enables researchers to quantify the proportion of disease spots on leaves, known as disease severity. Current deep learning methods predominantly focus on single diseases, simple lesions, or laboratory-controlled environments. In this study, we established and publicly released image datasets of field scenarios for three diseases: soybean bacterial blight (SBB), wheat stripe rust (WSR), and cedar apple rust (CAR). We developed Plant Disease Segmentation Networks (PDSNets) based on LinkNet with ResNet-18 as the encoder, including three versions: ×1.0, ×0.75, and ×0.5. The ×1.0 version incorporates a 4 × 4 embedding layer to enhance prediction speed, while versions ×0.75 and ×0.5 are lightweight variants with reduced channel numbers within the same architecture. Their parameter counts are 11.53 M, 6.50 M, and 2.90 M, respectively. PDSNetx0.5 achieved an overall F1 score of 91.96%, an Intersection over Union (IoU) of 85.85% for segmentation, and a coefficient of determination (R²) of 0.908 for severity estimation. On a local central processing unit (CPU), PDSNetx0.5 demonstrated a prediction speed of 34.18 images (640 × 640 pixels) per second, which is 2.66 times faster than LinkNet. Our work provides an efficient and automated approach for assessing plant disease severity in field scenarios. Full article
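The abstract defines disease severity as the proportion of disease spots on a leaf. A minimal sketch of that ratio on a segmentation mask follows; the label ids (0 for background, 1 for healthy leaf, 2 for lesion) and the function name `disease_severity` are hypothetical, and the released datasets may encode their masks differently.

```python
LEAF, LESION = 1, 2  # hypothetical label ids; 0 is assumed to be background


def disease_severity(mask):
    """Fraction of leaf pixels (healthy plus diseased) that are labelled as lesion."""
    leaf_pixels = sum(row.count(LEAF) for row in mask)
    lesion_pixels = sum(row.count(LESION) for row in mask)
    total_leaf = leaf_pixels + lesion_pixels
    if total_leaf == 0:
        raise ValueError("mask contains no leaf pixels")
    return lesion_pixels / total_leaf


# Toy 3x4 mask: 6 healthy leaf pixels and 3 lesion pixels.
mask = [
    [0, 1, 1, 1],
    [0, 1, 2, 2],
    [0, 1, 1, 2],
]
severity = disease_severity(mask)
print(f"severity = {severity:.1%}")
```

Background pixels are excluded from the denominator so that the ratio does not depend on how much of the image the leaf fills.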

(This article belongs to the Special Issue Computational, AI and IT Solutions Helping Agriculture)

30 pages, 34873 KB

Open Access | Article

Text-Guided Synthesis in Medical Multimedia Retrieval: A Framework for Enhanced Colonoscopy Image Classification and Segmentation

by Ojonugwa Oluwafemi Ejiga Peter, Opeyemi Taiwo Adeniran, Adetokunbo MacGregor John-Otumu, Fahmi Khalifa and Md Mahmudur Rahman

Algorithms 2025, 18(3), 155; https://doi.org/10.3390/a18030155 - 9 Mar 2025

Cited by 11 | Viewed by 3222

Abstract

The lack of extensive, varied, and thoroughly annotated datasets impedes the advancement of artificial intelligence (AI) for medical applications, especially colorectal cancer detection. Models trained with limited diversity often display biases, especially when utilized on disadvantaged groups. Generative models (e.g., DALL-E 2, Vector-Quantized Generative Adversarial Network (VQ-GAN)) have been used to generate images but not colonoscopy data for intelligent data augmentation. This study developed an effective method for producing synthetic colonoscopy image data, which can be used to train advanced medical diagnostic models for robust colorectal cancer detection and treatment. Text-to-image synthesis was performed using fine-tuned Visual Large Language Models (LLMs). Stable Diffusion and DreamBooth Low-Rank Adaptation produce images that look authentic, with an average Inception score of 2.36 across three datasets. The validation accuracies of the classification models Big Transfer (BiT), Fixed Resolution Residual Next Generation Network (FixResNeXt), and Efficient Neural Network (EfficientNet) were 92%, 91%, and 86%, respectively. Vision Transformer (ViT) and Data-Efficient Image Transformers (DeiT) had an accuracy rate of 93%. For the segmentation of polyps, the ground truth masks were generated using the Segment Anything Model (SAM). Then, five segmentation models (U-Net, Pyramid Scene Parsing Network (PSPNet), Feature Pyramid Network (FPN), Link Network (LinkNet), and Multi-scale Attention Network (MANet)) were adopted. FPN produced excellent results, with an Intersection over Union (IoU) of 0.64, an F1 score of 0.78, a recall of 0.75, and a Dice coefficient of 0.77. This demonstrates strong performance in terms of both segmentation accuracy and overlap metrics, with particularly robust results in balanced detection capability as shown by the high F1 score and Dice coefficient. This highlights how AI-generated medical images can improve colonoscopy analysis, which is critical for early colorectal cancer detection. Full article

(This article belongs to the Special Issue Algorithms and Applications of Machine Learning Techniques for Healthcare)

20 pages, 52216 KB

Open Access | Article

Semantic-to-Instance Segmentation of Time-Invariant Offshore Wind Farms Using Sentinel-1 Time Series and Time-Shift Augmentation

by Osmar Luiz Ferreira de Carvalho, Osmar Abílio de Carvalho Junior, Anesmar Olino de Albuquerque and Daniel Guerreiro e Silva

Energies 2025, 18(5), 1127; https://doi.org/10.3390/en18051127 - 25 Feb 2025

Cited by 1 | Viewed by 1234

Abstract

The rapid expansion of offshore wind energy requires effective monitoring to balance renewable energy development with environmental and marine spatial planning. This study proposes a novel offshore wind farm detection methodology integrating Sentinel-1 SAR time series, a time-shift augmentation strategy, and semantic-to-instance segmentation transformation. The methodology consists of (1) constructing a dataset with offshore wind farms labeled from Sentinel-1 SAR time series, (2) applying a time-shift augmentation strategy by randomizing image sequences during training (avoiding overfitting due to chronological ordering), (3) evaluating six deep learning architectures (U-Net, U-Net++, LinkNet, DeepLabv3+, FPN, and SegFormer) across time-series lengths of 1, 5, 10, and 15 images, and (4) converting the semantic segmentation results into instance-level detections using Geographic Information System tools. The results show that increasing the time-series length from 1 to 15 images significantly improves performance, with the Intersection over Union increasing from 63.29% to 81.65% and the F-score from 77.52% to 89.90%, using the best model (LinkNet). Also, models trained with time-shift augmentation achieved a 25% higher IoU and an 18% higher F-score than those trained without it. The semantic-to-instance transformation achieved 99.7% overall quality in per-object evaluation, highlighting the effectiveness of our approach. Full article
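Step (4) of the methodology converts a semantic mask into instance-level detections using Geographic Information System tools. As a stand-in for those tools, and not the authors' workflow, the conversion can be sketched with plain 4-connected component labelling, where each connected foreground region becomes one instance. The function name `label_instances` and the choice of 4-connectivity are assumptions made for illustration.

```python
from collections import deque


def label_instances(mask):
    """Give each 4-connected foreground region of a non-empty binary mask its own id.

    Returns (labels, count): labels has the same shape as mask, with 0 for
    background and 1..count for the regions, numbered in scan order.
    """
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and labels[r][c] == 0:
                # Unvisited foreground pixel: start a new instance and flood-fill it.
                count += 1
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        inside = 0 <= ny < rows and 0 <= nx < cols
                        if inside and mask[ny][nx] and labels[ny][nx] == 0:
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


# Toy semantic mask with three separate foreground regions.
semantic_mask = [
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 1, 0, 1],
]
labels, count = label_instances(semantic_mask)
print(count, labels)
```

Touching objects merge into one region under this scheme, so it only separates instances that are spatially isolated in the semantic output.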

(This article belongs to the Section A3: Wind, Wave and Tidal Energy)

11 pages, 5932 KB

Open Access | Article

Automating an Encoder–Decoder Incorporated Ensemble Model: Semantic Segmentation Workflow on Low-Contrast Underwater Images

by Jale Bektaş

Appl. Sci. 2024, 14(24), 11964; https://doi.org/10.3390/app142411964 - 20 Dec 2024

Cited by 2 | Viewed by 1390

Abstract

Numerous methods have been proposed for semantic segmentation, and the state of the art is dominated by deep learning-based methods, which show salient performance. This study addresses the challenge of semantic segmentation in low-contrast, imbalanced underwater images. It employs nine model fusions as a downstream workflow task, using encoder–decoder architectures trained with Dice Loss and Focal Loss to address the data imbalance. The two most effective encoder–decoder fusion models, Res34+Unet and VGG19+FPN, outperformed the other models, with 0.592% and 0.590% mIoU on average and 0.510% and 0.491% F1-score, respectively. Using a weight-optimization algorithm, the ensemble model with recreated IoU results improves the accuracy of both the Res34+Unet and the VGG19+FPN models, reaching 0.652% mIoU on average, a gain of 6%. The ensemble model combines the independent models by considering their inference accuracy separately on a per-class basis and emphasizing the better model for each class. Full article
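The abstract names Focal Loss as one of the losses used against class imbalance. A minimal per-pixel sketch of the binary focal loss in its common form, -(1 - p_t)^gamma * log(p_t), is given below. The default gamma of 2.0, the omission of the optional class-weighting factor, and the function name `focal_loss` are assumptions made here and are not settings taken from the paper.

```python
import math


def focal_loss(prob, label, gamma=2.0, eps=1e-12):
    """Binary focal loss for one pixel.

    prob is the predicted foreground probability and label is 0 or 1.
    The (1 - p_t) ** gamma factor shrinks the loss of well-classified pixels.
    """
    # p_t is the probability the model assigned to the true class.
    p_t = prob if label == 1 else 1.0 - prob
    # Clamp to avoid log(0) on fully confident wrong predictions.
    p_t = min(max(p_t, eps), 1.0)
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


easy = focal_loss(0.9, 1)  # confident and correct: strongly down-weighted
hard = focal_loss(0.1, 1)  # confident and wrong: keeps most of its loss
print(f"easy={easy:.5f}, hard={hard:.5f}")
```

With gamma set to 0 the expression reduces to ordinary cross-entropy, which makes the effect of the modulating factor easy to compare.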

(This article belongs to the Section Computing and Artificial Intelligence)

19 pages, 5047 KB

Open Access | Feature Paper | Article

A Convolutional Neural Network for the Removal of Simultaneous Ocular and Myogenic Artifacts from EEG Signals

by Maryam Azhar, Tamoor Shafique and Anas Amjad

Electronics 2024, 13(22), 4576; https://doi.org/10.3390/electronics13224576 - 20 Nov 2024

Cited by 6 | Viewed by 4242

Abstract

Electroencephalography (EEG) is a non-invasive technique widely used in neuroscience to diagnose neural disorders and analyse brain activity. However, ocular and myogenic artifacts from eye movements and facial muscle activity often contaminate EEG signals, compromising signal analysis accuracy. While deep learning models are a popular choice for denoising EEG signals, most focus on removing either ocular or myogenic artifacts independently. This paper introduces a novel EEG denoising model capable of handling the simultaneous occurrence of both artifacts. The model uses convolutional layers to extract spatial features and a fully connected layer to reconstruct clean signals from learned features. The model integrates the Adam optimiser, average pooling, and ReLU activation to effectively capture and restore clean EEG signals. It demonstrates superior performance, achieving low training and validation losses with a significantly reduced RRMSE value of 0.35 in both the temporal and spectral domains. A high cross-correlation coefficient of 0.94 with ground-truth EEG signals confirms the model’s fidelity. Compared to the existing architectures and models (FPN, UNet, MCGUNet, LinkNet, MultiResUNet3+, Simple CNN, Complex CNN) across a range of signal-to-noise ratio values, the model shows superior performance for artifact removal. It also mitigates overfitting, underscoring its robustness in artifact suppression. Full article
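The two quality measures quoted in this abstract can be sketched for the temporal domain. The sketch assumes that RRMSE denotes the relative root mean square error (RMS of the residual divided by RMS of the clean reference) and that the cross-correlation coefficient is the Pearson correlation at zero lag; the abstract confirms neither definition, and the spectral-domain variant is not covered. Function names are hypothetical.

```python
import math


def rrmse(denoised, clean):
    """Relative RMSE: RMS of (denoised - clean) divided by RMS of clean."""
    if len(denoised) != len(clean):
        raise ValueError("signals must have the same length")
    residual_power = sum((d - c) ** 2 for d, c in zip(denoised, clean))
    clean_power = sum(c ** 2 for c in clean)
    return math.sqrt(residual_power / clean_power)


def correlation(denoised, clean):
    """Pearson correlation coefficient between two equally long signals."""
    n = len(clean)
    mean_d = sum(denoised) / n
    mean_c = sum(clean) / n
    cov = sum((d - mean_d) * (c - mean_c) for d, c in zip(denoised, clean))
    norm_d = math.sqrt(sum((d - mean_d) ** 2 for d in denoised))
    norm_c = math.sqrt(sum((c - mean_c) ** 2 for c in clean))
    return cov / (norm_d * norm_c)


# Toy example: the "denoised" signal has the right shape but half the amplitude.
clean = [0.0, 1.0, 0.0, -1.0]
denoised = [0.0, 0.5, 0.0, -0.5]
print(rrmse(denoised, clean), correlation(denoised, clean))
```

The toy example shows why both numbers are usually reported together: correlation is insensitive to amplitude errors, whereas RRMSE penalises them.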

26 pages, 19104 KB

Open Access | Article

Accurately Segmenting/Mapping Tobacco Seedlings Using UAV RGB Images Collected from Different Geomorphic Zones and Different Semantic Segmentation Models

by Qianxia Li, Zhongfa Zhou, Yuzhu Qian, Lihui Yan, Denghong Huang, Yue Yang and Yining Luo

Plants 2024, 13(22), 3186; https://doi.org/10.3390/plants13223186 - 13 Nov 2024

Cited by 4 | Viewed by 1958

Abstract

The tobacco seedling stage is a crucial period for tobacco cultivation. Accurately extracting tobacco seedlings from satellite images can effectively assist farmers in replanting, precise fertilization, and subsequent yield estimation. However, in complex Karst mountainous areas, it is extremely challenging to accurately segment tobacco plants due to a variety of factors, such as the topography, the planting environment, and difficulties in obtaining high-resolution image data. Therefore, this study explores an accurate segmentation model for detecting tobacco seedlings from UAV RGB images across various geomorphic partitions, including dam and hilly areas. It explores a family of tobacco plant seedling segmentation networks, namely, U-Net, U-Net++, Linknet, PSPNet, MAnet, FPN, PAN, and DeepLabV3+, using the Hill Seedling Tobacco Dataset (HSTD), the Dam Area Seedling Tobacco Dataset (DASTD), and the Hilly Dam Area Seedling Tobacco Dataset (H-DASTD) for model training. To validate the performance of the semantic segmentation models for crop segmentation in the complex cropping environments of Karst mountainous areas, this study compares and analyzes the predicted results with the manually labeled true values. 
The results show that: (1) the accuracy of the models in segmenting tobacco seedling plants in the dam area is much higher than that in the hilly area, with the mean values of mIoU, PA, Precision, Recall, and the Kappa Coefficient reaching 87%, 97%, 91%, 85%, and 0.81 in the dam area and 81%, 97%, 72%, 73%, and 0.73 in the hilly area, respectively; (2) The segmentation accuracies of the models differ significantly across different geomorphological zones; the U-Net segmentation results are optimal for the dam area, with higher values of mIoU (93.83%), PA (98.83%), Precision (93.27%), Recall (96.24%), and the Kappa Coefficient (0.9440) than those of the other models; in the hilly area, the U-Net++ segmentation performance is better than that of the other models, with mIoU and PA of 84.17% and 98.56%, respectively; (3) The diversity of tobacco seedling samples affects the model segmentation accuracy, as shown by the Kappa Coefficient, with H-DASTD (0.901) > DASTD (0.885) > HSTD (0.726); (4) With regard to the factors affecting missed segregation, although the factors affecting the dam area and the hilly area are different, the main factors are small tobacco plants (STPs) and weeds for both areas. This study shows that the accurate segmentation of tobacco plant seedlings in dam and hilly areas based on UAV RGB images and semantic segmentation models can be achieved, thereby providing new ideas and technical support for accurate crop segmentation in Karst mountainous areas. Full article
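Alongside pixel accuracy (PA), the study reports the Kappa Coefficient. A minimal sketch of Cohen's kappa for a binary seedling/background confusion matrix follows; the counts are invented for illustration and the function name `cohens_kappa` is hypothetical.

```python
def cohens_kappa(tp, fp, fn, tn):
    """Cohen's kappa for a 2x2 confusion matrix given as pixel counts."""
    total = tp + fp + fn + tn
    # Observed agreement is the pixel accuracy.
    observed = (tp + tn) / total
    # Agreement expected by chance, from the row and column marginals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total ** 2
    return (observed - expected) / (1 - expected)


# Invented counts: 100 pixels, 85 of them classified correctly.
kappa = cohens_kappa(tp=40, fp=10, fn=5, tn=45)
print(f"kappa = {kappa:.2f}")
```

Because kappa discounts chance agreement, it can sit well below PA when background pixels dominate, which may explain why a PA of 97% accompanies quite different kappa values in the dam and hilly areas.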

(This article belongs to the Special Issue Application of Remote Sensing in Crop Production and Farmland Soil Monitoring)

21 pages, 5465 KB

Open Access | Article

Deep Learning Approaches for Wildfire Severity Prediction: A Comparative Study of Image Segmentation Networks and Visual Transformers on the EO4WildFires Dataset

by Dimitris Sykas, Dimitrios Zografakis and Konstantinos Demestichas

Fire 2024, 7(11), 374; https://doi.org/10.3390/fire7110374 - 23 Oct 2024

Cited by 11 | Viewed by 6152

Abstract

This paper investigates the applicability of deep learning models for predicting the severity of forest wildfires, utilizing an innovative benchmark dataset called EO4WildFires. EO4WildFires integrates multispectral imagery from Sentinel-2, SAR data from Sentinel-1, and meteorological data from NASA Power annotated with EFFIS data for forest fire detection and size estimation. These data cover 45 countries with a total of 31,730 wildfire events from 2018 to 2022. All of these various sources of data are archived into data cubes, with the intention of assessing wildfire severity by considering both current and historical forest conditions, utilizing a broad range of data including temperature, precipitation, and soil moisture. The experimental setup has been arranged to test the effectiveness of different deep learning architectures in predicting the size and shape of wildfire-burned areas. This study incorporates both image segmentation networks and visual transformers, employing a consistent experimental design across various models to ensure the comparability of the results. Adjustments were made to the training data, such as the exclusion of empty labels and very small events, to refine the focus on more significant wildfire events and potentially improve prediction accuracy. The models’ performance was evaluated using metrics like F1 score, IoU score, and Average Percentage Difference (aPD). These metrics offer a multi-faceted view of model performance, assessing aspects such as precision, sensitivity, and the accuracy of the burned area estimation. In extensive testing, the final model, utilizing LinkNet and ResNet-34 as backbones, achieved the following results on the test set: 0.86 F1 score, 0.75 IoU, and 70% aPD. These results were obtained when all of the available samples were used. When the empty labels were absent during the training and testing, the model increased its performance significantly: 0.87 F1 score, 0.77 IoU, and 44.8% aPD.
This indicates that the number of samples, as well as their respective size (area), tends to have an impact on the model’s robustness. This restriction is well known in the remote sensing domain, as accessible, accurately labeled data may be limited. Visual transformers like TeleViT showed potential but underperformed compared to segmentation networks in terms of F1 and IoU scores. Full article

(This article belongs to the Special Issue Machine Learning-Based Wildfire Modeling: Unveiling Innovative Methodologies for Enhanced Fire Prediction and Analysis)

20 pages, 6289 KB

Open Access | Article

A High-Resolution Remote Sensing Road Extraction Method Based on the Coupling of Global Spatial Features and Fourier Domain Features

by Hui Yang, Caili Zhou, Xiaoyu Xing, Yongchuang Wu and Yanlan Wu

Remote Sens. 2024, 16(20), 3896; https://doi.org/10.3390/rs16203896 - 20 Oct 2024

Cited by 10 | Viewed by 3517

Abstract

Remote sensing road extraction based on deep learning is an important method for road extraction. However, in complex remote sensing images, different road information often exhibits varying frequency distributions and texture characteristics, and it is usually difficult to express the comprehensive characteristics of roads effectively from a single spatial domain perspective. To address the aforementioned issues, this article proposes a road extraction method that couples global spatial learning with Fourier frequency domain learning. This method first utilizes a transformer to capture global road features and then applies Fourier transform to separate and enhance high-frequency and low-frequency information. Finally, it integrates spatial and frequency domain features to express road characteristics comprehensively and overcome the effects of intra-class differences and occlusions. Experimental results on HF, MS, and DeepGlobe road datasets show that our method can more comprehensively express road features compared with other deep learning models (e.g., Unet, D-Linknet, DeepLab-v3, DCSwin, SGCN) and extract road boundaries more accurately and coherently. The IOU accuracy of the extracted results also achieved 72.54%, 55.35%, and 71.87%. Full article

(This article belongs to the Special Issue Road Extraction and Distress Assessment by Spaceborne, Airborne and Terrestrial Platforms (Second Edition))
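
The frequency separation mentioned in the abstract above can be illustrated on a single row of pixels. This is only a minimal sketch under stated assumptions, not the authors' network: the naive 1-D transform, the `split_frequencies` helper, and the `cutoff` value are all hypothetical and chosen for readability.

```python
import cmath
import math


def dft(values):
    """Naive discrete Fourier transform of a 1-D sequence."""
    n = len(values)
    return [
        sum(values[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(coeffs):
    """Inverse of dft(); returns complex samples."""
    n = len(coeffs)
    return [
        sum(coeffs[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def split_frequencies(signal, cutoff):
    """Split a real signal into low- and high-frequency parts.

    `cutoff` is a hypothetical threshold on the frequency index; it is an
    illustrative assumption, not a value taken from the paper.
    """
    coeffs = dft(signal)
    n = len(coeffs)
    # Indices k and n - k carry the same absolute frequency, so the mask is
    # symmetric and both reconstructed parts stay real-valued.
    is_low = [min(k, n - k) <= cutoff for k in range(n)]
    low = idft([c if keep else 0 for c, keep in zip(coeffs, is_low)])
    high = idft([0 if keep else c for c, keep in zip(coeffs, is_low)])
    return [v.real for v in low], [v.real for v in high]


# Toy pixel row: a bright "road" segment on a dark background.
row = [0, 0, 1, 1, 1, 0, 0, 0]
low_part, high_part = split_frequencies(row, cutoff=1)
```

The low-frequency part carries the smooth overall shape of the segment and the high-frequency part carries its sharp edges; adding the two parts back together recovers the original row.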

19 pages, 7665 KB

Open Access Article

Chestnut Burr Segmentation for Yield Estimation Using UAV-Based Imagery and Deep Learning

by Gabriel A. Carneiro, Joaquim Santos, Joaquim J. Sousa, António Cunha and Luís Pádua

Drones 2024, 8(10), 541; https://doi.org/10.3390/drones8100541 - 1 Oct 2024

Cited by 5 | Viewed by 2205

Abstract

Precision agriculture (PA) has advanced agricultural practices, offering new opportunities for crop management and yield optimization. The use of unmanned aerial vehicles (UAVs) in PA enables high-resolution data acquisition, which has been adopted across different agricultural sectors. However, its application for decision support in chestnut plantations remains under-represented. This study presents the initial development of a methodology for segmenting chestnut burrs from UAV-based imagery to estimate chestnut productivity in point cloud data. Deep learning (DL) architectures, including U-Net, LinkNet, and PSPNet, were employed for chestnut burr segmentation in UAV images captured at a 30 m flight height, with YOLOv8m trained for comparison. Two datasets were used to train and evaluate the models: one newly introduced in this study and an existing dataset. U-Net demonstrated the best performance, achieving an F1-score of 0.56 and a counting accuracy of 0.71 on the proposed dataset when both datasets were combined during training. The primary challenge was that burrs tend to grow in clusters, which merge into single regions in the segmentation output, making object detection potentially more suitable for counting. Nevertheless, the results show that DL architectures can generate masks for point cloud segmentation, supporting precise chestnut tree production estimation in future studies.

(This article belongs to the Special Issue Intelligent Processing and Application of UAV Remote Sensing Image Data)
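
The clustering problem described in the abstract above can be seen when objects are counted as connected regions of a binary mask. The sketch below is an illustration under assumptions, not the study's counting procedure: the `count_regions` helper, the 4-connectivity rule, and the toy mask are all hypothetical.

```python
from collections import deque


def count_regions(mask):
    """Count 4-connected regions of 1s in a binary mask (list of rows)."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != 1 or seen[r][c]:
                continue
            # New region found: flood-fill it with a breadth-first search.
            regions += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (
                        0 <= ny < rows
                        and 0 <= nx < cols
                        and mask[ny][nx] == 1
                        and not seen[ny][nx]
                    ):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return regions


# Toy mask (illustrative only): one isolated burr in the top-left corner and
# two touching burrs on the right that merge into a single region.
toy_mask = [
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
]
region_count = count_regions(toy_mask)
```

Here three burrs are drawn but only two regions are found, which is the kind of undercount that makes a per-object detector attractive for counting.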

16 pages, 10348 KB

Open Access Article

Pointer Meter Reading Method Based on YOLOv8 and Improved LinkNet

by Xiaohu Lu, Shisong Zhu and Bibo Lu

Sensors 2024, 24(16), 5288; https://doi.org/10.3390/s24165288 - 15 Aug 2024

Cited by 6 | Viewed by 2839

Abstract

To improve the reading efficiency of pointer meters, this paper proposes a reading method based on LinkNet. First, the meter dial area is detected using YOLOv8. The detected images are then fed into the improved LinkNet segmentation network. In this network, we replace traditional convolution with partial convolution, which reduces the number of model parameters without affecting accuracy. We also remove one pair of encoding and decoding modules to further compress the model size. In the feature fusion part of the model, the CBAM (Convolutional Block Attention Module) attention module is added and the direct summing operation is replaced by the AFF (Attention Feature Fusion) module, which enhances the feature extraction capability of the model for the segmented target. In the subsequent rotation correction stage, this paper addresses the inaccurate prediction by CNNs for axisymmetric images within the 0–360° range by dividing the rotation angle prediction into classification and regression steps. This ensures that the final reading stage receives a correctly oriented input image, thereby improving the accuracy of the overall reading algorithm. The final experimental results indicate that our proposed reading method has a mean absolute error of 0.20 and a frame rate of 15.

(This article belongs to the Section Industrial Sensors)
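
One common way to split an angle prediction into a classification step and a regression step is to classify a coarse angular bin and regress the offset inside it. The abstract above does not say how the authors do this, so the sketch below is purely an assumption: the 90° bin width and the `encode_angle` / `decode_angle` helpers are hypothetical.

```python
def encode_angle(angle_deg, bin_width=90.0):
    """Turn an angle into (coarse bin index, offset inside the bin).

    The bin index would be a classification target and the offset a
    regression target. The 90-degree bin width is an assumed value.
    """
    angle = angle_deg % 360.0
    bin_index = int(angle // bin_width)
    offset = angle - bin_index * bin_width
    return bin_index, offset


def decode_angle(bin_index, offset_deg, bin_width=90.0):
    """Recombine a predicted bin and offset into an angle in [0, 360)."""
    return (bin_index * bin_width + offset_deg) % 360.0


# Round trip for an illustrative rotation of 200 degrees.
bin_index, offset = encode_angle(200.0)
restored = decode_angle(bin_index, offset)
```

Under this scheme the classifier only has to pick the right quadrant, and the regressor works on a narrow range without the wrap-around at 0° and 360°.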

Search Results (55)
