Search Results (32)

Search Parameters:
Keywords = DLS LiDAR

27 pages, 658 KiB  
Systematic Review
Advances in the Automated Identification of Individual Tree Species: A Systematic Review of Drone- and AI-Based Methods in Forest Environments
by Ricardo Abreu-Dias, Juan M. Santos-Gago, Fernando Martín-Rodríguez and Luis M. Álvarez-Sabucedo
Technologies 2025, 13(5), 187; https://doi.org/10.3390/technologies13050187 - 6 May 2025
Viewed by 1079
Abstract
The classification and identification of individual tree species in forest environments are critical for biodiversity conservation, sustainable forestry management, and ecological monitoring. Recent advances in drone technology and artificial intelligence have enabled new methodologies for detecting and classifying trees at an individual level. However, significant challenges persist, particularly in heterogeneous forest environments with high species diversity and complex canopy structures. This systematic review explores the latest research on drone-based data collection and AI-driven classification techniques, focusing on studies that classify specific tree species rather than performing generic tree detection. Following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines, peer-reviewed studies from the last decade were analyzed to identify trends in data acquisition instruments (e.g., RGB, multispectral, hyperspectral, LiDAR), preprocessing techniques, segmentation approaches, and machine learning (ML) algorithms used for classification. The findings of this study reveal that deep learning (DL) models, particularly convolutional neural networks (CNNs), are increasingly replacing traditional ML methods such as random forests (RF) and support vector machines (SVMs), because DL models require no separate feature extraction phase: feature learning is implicit in the models themselves. The integration of LiDAR with hyperspectral imaging further enhances classification accuracy but remains limited due to cost constraints. Additionally, we discuss the challenges of model generalization across different forest ecosystems and propose future research directions, including the development of standardized datasets and improved model architectures for robust tree species classification. This review provides a comprehensive synthesis of existing methodologies, highlighting both advancements and persistent gaps in AI-driven forest monitoring.
(This article belongs to the Collection Review Papers Collection for Advanced Technologies)
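To make the review's central observation concrete, the sketch below contrasts the two pipelines it describes: a classical classifier that needs handcrafted features versus a CNN that learns features implicitly from raw crown imagery. It is an illustrative toy example with random placeholder data, not code from any reviewed study.

```python
# Illustrative contrast (not from the review): classical ML needs handcrafted
# features, while a CNN learns features from raw crown images directly.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
import torch
import torch.nn as nn

# Classical pipeline: handcrafted features (e.g., mean band reflectance,
# crown height percentiles) must be computed before classification.
X_features = np.random.rand(100, 12)   # 100 crowns x 12 hypothetical features
y = np.random.randint(0, 5, 100)       # 5 hypothetical tree species
rf = RandomForestClassifier(n_estimators=200).fit(X_features, y)

# DL pipeline: raw image patches go in; feature extraction is implicit
# in the convolutional layers.
class CrownCNN(nn.Module):
    def __init__(self, n_species=5):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
            nn.Flatten(), nn.Linear(32, n_species))
    def forward(self, x):
        return self.net(x)

logits = CrownCNN()(torch.rand(8, 3, 64, 64))  # 8 RGB crown patches
```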

28 pages, 13811 KiB  
Article
MMTSCNet: Multimodal Tree Species Classification Network for Classification of Multi-Source, Single-Tree LiDAR Point Clouds
by Jan Richard Vahrenhold, Melanie Brandmeier and Markus Sebastian Müller
Remote Sens. 2025, 17(7), 1304; https://doi.org/10.3390/rs17071304 - 5 Apr 2025
Cited by 1 | Viewed by 790
Abstract
Trees play a critical role in climate regulation, biodiversity, and carbon storage as they cover approximately 30% of the global land area. Nowadays, Machine Learning (ML) is key to automating large-scale tree species classification based on active and passive sensing systems, with a recent trend favoring data fusion approaches for higher accuracy. The use of 3D Deep Learning (DL) models has improved tree species classification by capturing structural and geometric data directly from point clouds. We propose a fully Multimodal Tree Species Classification Network (MMTSCNet) that processes Light Detection and Ranging (LiDAR) point clouds, Full-Waveform (FWF) data, derived features, and bidirectional, color-coded depth images in their native data formats without any modality transformation. We conduct several experiments as well as an ablation study to assess the impact of data fusion. Classification performance on the combination of Airborne Laser Scanning (ALS) data with FWF data scored the highest, achieving an Overall Accuracy (OA) of nearly 97%, a Mean Average F1-score (MAF) of nearly 97%, and a Kappa Coefficient of 0.96. Results for the other data subsets show that the ALS data, in combination with or even without FWF data, produced the best results, closely followed by the UAV-borne Laser Scanning (ULS) data. Additionally, it is evident that the inclusion of FWF data provided significant benefits to the classification performance, resulting in an increase in the MAF of +4.66% for the ALS data, +4.69% for the ULS data under leaf-on conditions, and +2.59% for the ULS data under leaf-off conditions. The proposed model is also compared to a state-of-the-art unimodal 3D-DL model (PointNet++) as well as a feature-based unimodal DL architecture (DSTCN). The MMTSCNet architecture outperformed the other models by several percentage points, depending on the characteristics of the input data.
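The exact MMTSCNet architecture is not reproduced here, but the sketch below illustrates the general multimodal late-fusion pattern the abstract describes: separate branches for point clouds, color-coded depth images, and FWF-derived features, concatenated before a classification head. The branch sizes, the 10 FWF features, and the 7-species output are assumptions for illustration only.

```python
import torch
import torch.nn as nn

class PointBranch(nn.Module):
    # Shared MLP over points followed by max-pooling: a PointNet-style
    # global feature, standing in for a full 3D-DL backbone.
    def __init__(self, out_dim=64):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, out_dim))
    def forward(self, pts):                # pts: (B, N, 3)
        return self.mlp(pts).max(dim=1).values

class ImageBranch(nn.Module):
    # Tiny CNN over color-coded depth images.
    def __init__(self, out_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
            nn.Flatten(), nn.Linear(16, out_dim))
    def forward(self, img):
        return self.net(img)

class LateFusionClassifier(nn.Module):
    def __init__(self, n_species=7):
        super().__init__()
        self.points, self.images = PointBranch(), ImageBranch()
        self.fwf = nn.Linear(10, 64)       # hypothetical 10 FWF-derived features
        self.head = nn.Linear(64 * 3, n_species)
    def forward(self, pts, img, fwf):
        # Each modality stays in its native form until the feature level,
        # where the branch outputs are concatenated and classified.
        fused = torch.cat([self.points(pts), self.images(img), self.fwf(fwf)], dim=1)
        return self.head(fused)

model = LateFusionClassifier()
logits = model(torch.rand(2, 1024, 3), torch.rand(2, 3, 64, 64), torch.rand(2, 10))
```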

29 pages, 4530 KiB  
Systematic Review
Advances in Deep Learning for Semantic Segmentation of Low-Contrast Images: A Systematic Review of Methods, Challenges, and Future Directions
by Claudio Urrea and Maximiliano Vélez
Sensors 2025, 25(7), 2043; https://doi.org/10.3390/s25072043 - 25 Mar 2025
Viewed by 3275
Abstract
The semantic segmentation (SS) of low-contrast images (LCIs) remains a significant challenge in computer vision, particularly for sensor-driven applications like medical imaging, autonomous navigation, and industrial defect detection, where accurate object delineation is critical. This systematic review provides a comprehensive evaluation of state-of-the-art deep learning (DL) techniques for improving segmentation accuracy in LCI scenarios, addressing the key challenges that limit conventional methods: diffuse boundaries and regions with similar pixel intensities. Key advancements include attention mechanisms, multi-scale feature extraction, and hybrid architectures combining Convolutional Neural Networks (CNNs) with Vision Transformers (ViTs), which expand the Effective Receptive Field (ERF), improve feature representation, and optimize information flow. We compare the performance of 25 models, evaluating accuracy (e.g., mean Intersection over Union (mIoU), Dice Similarity Coefficient (DSC)), computational efficiency, and robustness across benchmark datasets relevant to automation and robotics. This review identifies limitations, including the scarcity of diverse, annotated LCI datasets and the high computational demands of transformer-based models. Future opportunities emphasize lightweight architectures, advanced data augmentation, integration with multimodal sensor data (e.g., LiDAR, thermal imaging), and ethically transparent AI to build trust in automation systems. This work contributes a practical guide for enhancing LCI segmentation, improving mean accuracy metrics such as mIoU by up to 15% in sensor-based applications, as evidenced by benchmark comparisons, and serves as a concise, comprehensive reference for researchers and practitioners advancing DL-based LCI segmentation in real-world sensor applications.
(This article belongs to the Section Sensing and Imaging)
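For reference, the two accuracy metrics named above can be computed from a confusion matrix as in this minimal sketch (standard definitions of mIoU and Dice, not tied to any of the 25 compared models):

```python
import numpy as np

def confusion_matrix(pred, gt, n_classes):
    # Rows index ground truth, columns index predictions.
    idx = n_classes * gt.reshape(-1) + pred.reshape(-1)
    return np.bincount(idx, minlength=n_classes**2).reshape(n_classes, n_classes)

def miou_and_dice(pred, gt, n_classes):
    cm = confusion_matrix(pred, gt, n_classes).astype(float)
    tp = np.diag(cm)
    fp = cm.sum(axis=0) - tp
    fn = cm.sum(axis=1) - tp
    iou = tp / np.maximum(tp + fp + fn, 1e-9)        # per-class IoU
    dice = 2 * tp / np.maximum(2 * tp + fp + fn, 1e-9)  # per-class Dice (DSC)
    return iou.mean(), dice.mean()

pred = np.random.randint(0, 3, (64, 64))   # toy 3-class segmentation maps
gt = np.random.randint(0, 3, (64, 64))
print(miou_and_dice(pred, gt, 3))
```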

28 pages, 1683 KiB  
Article
Energy-Saving Geospatial Data Storage—LiDAR Point Cloud Compression
by Artur Warchoł, Karolina Pęzioł and Marek Baścik
Energies 2024, 17(24), 6413; https://doi.org/10.3390/en17246413 - 20 Dec 2024
Cited by 2 | Viewed by 1578
Abstract
In recent years, the growth of digital data has been unimaginable. This also applies to geospatial data, and LiDAR point clouds are among the largest data types. Their large volumes on disk, at the acquisition and processing stages as well as in the final versions, translate into a high demand for disk space and therefore electricity. Lossless compression of these datasets is thus a natural way to reduce energy consumption, lower the carbon footprint of the activity, and promote sustainability in the industry's digitization. In this article, a new format for point clouds—3DL—is presented, and its effectiveness is compared with 21 available formats that can contain LiDAR data. A total of 404 processes were carried out to validate the 3DL file format. The validation was based on four LiDAR point clouds stored in LAS files: two files derived from ALS (airborne laser scanning), one in a local coordinate system and the other in PL-2000; and two obtained by TLS (terrestrial laser scanning), also with the same georeferencing (local and national PL-2000). During the research, each LAS file was saved in 101 different ways across 22 different formats, and the results were then compared in several ways (according to the coordinate system, ALS versus TLS data, both types of data within a single coordinate system, and the time of processing). The validated solution (3DL) achieved CR (compression rate, the compressed size as a percentage of the original, so lower is better) results of around 32% for ALS data and around 42% for TLS data, while the best solutions reached 15% for ALS and 34% for TLS. On the other hand, the worst method inflated the file to 424.92% of its original size (ALS_PL2000). This significant reduction in file size contributes to a significant reduction in energy consumption during the storage of LiDAR point clouds, their transmission over the internet, and copy/transfer operations. For all solutions, rankings were developed according to the CR and CT (compression time) parameters.
(This article belongs to the Special Issue Low-Energy Technologies in Heavy Industries)
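A minimal sketch of the CR metric as used above, with hypothetical byte counts chosen only to echo the reported values (a CR of 32% means the compressed file is about a third of the original; values above 100% mean the file grew):

```python
def compression_rate(original_bytes: int, compressed_bytes: int) -> float:
    # CR as reported above: compressed size as a percentage of the original,
    # so lower is better.
    return 100.0 * compressed_bytes / original_bytes

# Hypothetical sizes (bytes) for one ALS tile stored in different formats;
# the values are illustrative, not measurements from the paper.
sizes = {"LAS": 1_000_000_000, "3DL": 320_000_000,
         "best_format": 150_000_000, "worst_format": 4_249_200_000}

ranking = sorted(sizes, key=lambda f: compression_rate(sizes["LAS"], sizes[f]))
print([(f, round(compression_rate(sizes["LAS"], sizes[f]), 1)) for f in ranking])
```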

32 pages, 6180 KiB  
Article
Improving Sewer Damage Inspection: Development of a Deep Learning Integration Concept for a Multi-Sensor System
by Jan Thomas Jung and Alexander Reiterer
Sensors 2024, 24(23), 7786; https://doi.org/10.3390/s24237786 - 5 Dec 2024
Cited by 1 | Viewed by 2188
Abstract
The maintenance and inspection of sewer pipes are essential to urban infrastructure but remain predominantly manual, resource-intensive, and prone to human error. Advancements in artificial intelligence (AI) and computer vision offer significant potential to automate sewer inspections, improving reliability and reducing costs. However, the existing vision-based inspection robots fail to provide data quality sufficient for training reliable deep learning (DL) models. To address these limitations, we propose a novel multi-sensor robotic system coupled with a DL integration concept. Following a comprehensive review of the current 2D (image) and 3D (point cloud) sewage pipe inspection methods, we identify key limitations and propose a system incorporating a camera array, front camera, and LiDAR sensor to optimise surface capture and enhance data quality. Damage types are assigned to the sensor best suited for their detection and quantification, while tailored DL models are proposed for each sensor type to maximise performance. This approach enables the optimal detection and processing of relevant damage types, achieving higher accuracy for each compared to single-sensor systems.
(This article belongs to the Special Issue AI-Based Computer Vision Sensors & Systems)
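One possible reading of the damage-to-sensor assignment is a simple routing table; the mapping below is hypothetical and only illustrates the concept, since the paper's actual assignments are not reproduced here.

```python
# Hypothetical assignment table in the spirit of the concept above: each
# damage type is routed to the sensor best suited to detect or quantify it.
SENSOR_FOR_DAMAGE = {
    "surface_crack": "camera_array",      # fine texture -> high-res imagery
    "root_intrusion": "front_camera",     # obstruction visible head-on
    "deformation": "lidar",               # geometric deviation -> 3D points
    "joint_displacement": "lidar",
}

def route_observation(damage_type: str) -> str:
    # Fall back to the camera array for unknown damage types.
    return SENSOR_FOR_DAMAGE.get(damage_type, "camera_array")

print(route_observation("deformation"))   # -> lidar
```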

28 pages, 18069 KiB  
Article
An AI-Based Deep Learning with K-Mean Approach for Enhancing Altitude Estimation Accuracy in Unmanned Aerial Vehicles
by Prot Piyakawanich and Pattarapong Phasukkit
Drones 2024, 8(12), 718; https://doi.org/10.3390/drones8120718 - 29 Nov 2024
Cited by 1 | Viewed by 1682
Abstract
In the rapidly evolving domain of Unmanned Aerial Vehicles (UAVs), precise altitude estimation remains a significant challenge, particularly for lightweight UAVs. This research presents an innovative approach to enhance altitude estimation accuracy for UAVs weighing under 2 kg without cameras, utilizing advanced AI Deep Learning algorithms. The primary novelty of this study lies in its unique integration of unsupervised and supervised learning techniques. By synergistically combining K-Means Clustering with a multiple-input deep learning regression-based model (DL-KMA), we have achieved substantial improvements in altitude estimation accuracy. This methodology represents a significant advancement over conventional approaches in UAV technology. Our experimental design involved comprehensive field data collection across two distinct altitude environments, employing a high-precision Digital Laser Distance Meter as the reference standard (Class II). This rigorous approach facilitated a thorough evaluation of our model’s performance across varied terrains, ensuring robust and reliable results. The outcomes of our study are particularly noteworthy, with the model demonstrating remarkably low Mean Squared Error (MSE) values across all data clusters, ranging from 0.011 to 0.072. These results not only indicate significant improvements over traditional methods, but also establish a new benchmark in UAV altitude estimation accuracy. A key innovation in our approach is the elimination of costly additional hardware such as Light Detection and Ranging (LiDAR), offering a cost-effective, software-based solution. This advancement has broad implications, enhancing the accessibility of advanced UAV technology and expanding its potential applications across diverse sectors including precision agriculture, urban planning, and emergency response. This research represents a significant contribution to the integration of AI and UAV technology, potentially unlocking new possibilities in UAV applications. By enhancing the capabilities of lightweight UAVs, we are not merely improving a technical aspect, but revolutionizing the potential applications of UAVs across industries. Our work sets the stage for safer, more reliable, and more precise UAV operations, marking a pivotal moment in the evolution of aerial technology in an increasingly UAV-dependent world.
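The DL-KMA details belong to the authors; the sketch below only illustrates the general pattern of combining unsupervised clustering with per-cluster supervised regressors, using scikit-learn stand-ins (KMeans plus an MLP regressor) and random placeholder sensor features.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error

# Hypothetical onboard sensor features (e.g., barometer and IMU-derived
# values) and laser-distance-meter reference altitudes.
X = np.random.rand(500, 6)
y = 2.0 + 48.0 * np.random.rand(500)

# Step 1 (unsupervised): partition the flight data into regimes.
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

# Step 2 (supervised): fit one regressor per cluster.
models = {}
for c in range(3):
    mask = km.labels_ == c
    models[c] = MLPRegressor(hidden_layer_sizes=(32, 16), max_iter=2000,
                             random_state=0).fit(X[mask], y[mask])

# Report per-cluster MSE, mirroring the per-cluster reporting above.
for c, model in models.items():
    mask = km.labels_ == c
    print(c, mean_squared_error(y[mask], model.predict(X[mask])))
```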

14 pages, 6043 KiB  
Article
Developing Site-Specific Prescription Maps for Sugarcane Weed Control Using High-Spatial-Resolution Images and Light Detection and Ranging (LiDAR)
by Kerin F. Romero and Muditha K. Heenkenda
Land 2024, 13(11), 1751; https://doi.org/10.3390/land13111751 - 25 Oct 2024
Cited by 1 | Viewed by 1729
Abstract
Sugarcane is a perennial grass species grown mainly for sugar production and one of the significant crops in Costa Rica, where ideal growing conditions support its cultivation. Weed control is a critical aspect of sugarcane farming, traditionally managed through preventive or corrective mechanical and chemical methods. However, these methods can be time-consuming and costly. This study aimed to develop site-specific, variable-rate prescription maps for weed control using remote sensing. High-spatial-resolution images (5 cm) and Light Detection And Ranging (LiDAR) data were acquired using a Micasense Rededge-P camera and a DJI L1 sensor mounted on a drone. Precise locations of weeds were collected for calibration and validation. The Normalized Difference Vegetation Index (NDVI) derived from the multispectral images separated vegetation coverage from soil. A deep learning (DL) algorithm further classified vegetation coverage into sugarcane and weeds. The DL model performed well without overfitting, and its classification accuracy was 87% against validation samples. The density and average heights of weed patches were extracted from the LiDAR-derived canopy height model and used to derive site-specific prescription maps for weed control. This efficient and precise alternative to traditional methods could optimize weed control, reduce herbicide usage, and provide a more profitable yield.
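A minimal sketch of the NDVI-based vegetation/soil separation step, with random placeholder rasters and an assumed threshold of 0.3 (the study's actual threshold is not given here); in practice the bands would be read from the Micasense orthomosaic with a library such as rasterio.

```python
import numpy as np

def ndvi(nir: np.ndarray, red: np.ndarray) -> np.ndarray:
    # NDVI = (NIR - Red) / (NIR + Red), guarded against division by zero.
    return (nir - red) / np.maximum(nir + red, 1e-9)

# Hypothetical co-registered reflectance rasters.
nir = np.random.rand(512, 512)
red = np.random.rand(512, 512)

# Pixels above the assumed threshold are treated as vegetation; the DL
# classifier described above would then split vegetation into cane vs. weeds.
vegetation_mask = ndvi(nir, red) > 0.3
print(vegetation_mask.mean())   # fraction of pixels flagged as vegetation
```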

37 pages, 92018 KiB  
Article
Semantic Mapping of Landscape Morphologies: Tuning ML/DL Classification Approaches for Airborne LiDAR Data
by Marco Cappellazzo, Giacomo Patrucco, Giulia Sammartano, Marco Baldo and Antonia Spanò
Remote Sens. 2024, 16(19), 3572; https://doi.org/10.3390/rs16193572 - 25 Sep 2024
Cited by 1 | Viewed by 2029
Abstract
Interest in innovative solutions for classifying geospatial data acquired by integrated aerial methods is growing rapidly. The transition from unstructured to structured information is essential to set up and arrange geodatabases and cognitive systems such as digital twins capable of monitoring territorial, urban, and general conditions of natural and/or anthropized space, predicting future developments, and supporting risk prevention. This research is based on the study of classification methods and the consequent segmentation of low-altitude airborne LiDAR data in highly forested areas. In particular, the proposed approaches investigate integrating unsupervised classification methods with supervised Neural Network strategies, starting from unstructured point-based data formats. Furthermore, the research adopts Machine Learning classification methods for geomorphological analyses derived from DTM datasets. This paper also discusses the results from a comparative perspective, suggesting possible generalization capabilities concerning the case study investigated.

31 pages, 3112 KiB  
Article
Fusing Multispectral and LiDAR Data for CNN-Based Semantic Segmentation in Semi-Arid Mediterranean Environments: Land Cover Classification and Analysis
by Athanasia Chroni, Christos Vasilakos, Marianna Christaki and Nikolaos Soulakellis
Remote Sens. 2024, 16(15), 2729; https://doi.org/10.3390/rs16152729 - 25 Jul 2024
Cited by 2 | Viewed by 2263
Abstract
Spectral confusion among land cover classes is quite common, let alone in a complex and heterogeneous system like the semi-arid Mediterranean environment; thus, employing new developments in remote sensing, such as multispectral imagery (MSI) captured by unmanned aerial vehicles (UAVs) and airborne light detection and ranging (LiDAR) techniques, with deep learning (DL) algorithms for land cover classification can help to address this problem. Therefore, we propose an image-based land cover classification methodology based on fusing multispectral and airborne LiDAR data by adopting CNN-based semantic segmentation in a semi-arid Mediterranean area of the northeastern Aegean, Greece. The methodology consists of three stages: (i) data pre-processing, (ii) semantic segmentation, and (iii) accuracy assessment. The multispectral bands were stacked with the calculated Normalized Difference Vegetation Index (NDVI) and the LiDAR-based attributes (height, intensity, and number of returns) converted into two-dimensional (2D) images. Then, a hyper-parameter analysis was performed to investigate how varying the input tile size, the patch size for prediction, the learning rate, and the algorithm optimizer affects the classification accuracy and training time of the U-Net architecture. Finally, comparative experiments were conducted by altering the input data type to test our hypothesis, and the CNN model performance was analyzed by using accuracy assessment metrics and visually comparing the segmentation maps. The findings of this investigation showed that fusing multispectral and LiDAR data improves the classification accuracy of the U-Net, as it yielded the highest overall accuracy of 79.34% and a kappa coefficient of 0.6966, compared to using multispectral (OA: 76.03%; K: 0.6538) or LiDAR (OA: 37.79%; K: 0.0840) data separately. Although some confusion still exists among the seven land cover classes observed, the U-Net delivered a detailed and quite accurate segmentation map.
(This article belongs to the Special Issue Remote Sensing in Environmental Modelling)
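A minimal sketch of the input-stacking step described above, assuming five co-registered multispectral bands with red as band 3 and NIR as band 5 (the band order is an assumption for illustration):

```python
import numpy as np

# Hypothetical co-registered rasters (H x W): five multispectral bands plus
# LiDAR-derived height, intensity, and number-of-returns layers.
H, W = 256, 256
bands = np.random.rand(5, H, W)
height, intensity, n_returns = (np.random.rand(H, W) for _ in range(3))

# NDVI from the assumed NIR (band 5) and red (band 3) channels.
ndvi = (bands[4] - bands[2]) / np.maximum(bands[4] + bands[2], 1e-9)

# Stack everything channel-wise, as described above, to form the U-Net input.
x = np.stack([*bands, ndvi, height, intensity, n_returns], axis=0)
print(x.shape)   # (9, 256, 256): 5 bands + NDVI + 3 LiDAR attributes
```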

27 pages, 10879 KiB  
Article
Fusion of Google Street View, LiDAR, and Orthophoto Classifications Using Ranking Classes Based on F1 Score for Building Land-Use Type Detection
by Nafiseh Ghasemian Sorboni, Jinfei Wang and Mohammad Reza Najafi
Remote Sens. 2024, 16(11), 2011; https://doi.org/10.3390/rs16112011 - 3 Jun 2024
Cited by 6 | Viewed by 1851
Abstract
Building land-use type classification using earth observation data is essential for urban planning and emergency management. Municipalities usually do not hold a detailed record of building land-use types in their jurisdictions, and there is a significant need for a detailed classification of this data. Earth observation data can be beneficial in this regard because of their availability and the reduced fieldwork they require. In this work, we fed Google Street View (GSV) images, light detection and ranging-derived (LiDAR-derived) features, and orthophoto images to deep learning (DL) models. The DL models were trained on building land-use type data for the Greater Toronto Area (GTA), created using building land-use type labels from OpenStreetMap (OSM) and web scraping. We then classified buildings into apartment, house, industrial, institutional, mixed residential/commercial, office building, retail, and other. Three DL-derived classification maps from GSV, LiDAR, and orthophoto images were combined at the decision level using the proposed method of ranking classes based on their F1 scores. For comparison, the classifiers were also combined using fuzzy fusion. The results of two independent case studies, Vancouver and Fort Worth, showed that the proposed fusion method could achieve an overall accuracy of 75%, up to 8% higher than a previous study using CNNs and the same ground truth data. The results also showed that while mixed residential/commercial buildings were correctly detected using GSV images, the DL models confused many houses in the GTA with mixed residential/commercial buildings because of their similar appearance in GSV images.
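One minimal reading of decision-level fusion by "ranking classes based on F1 score" is sketched below: for each building, trust the classifier whose validation F1 is highest for the class it predicts. The per-class F1 values and the tie-breaking behavior are assumptions for illustration, not the paper's exact rule.

```python
import numpy as np

# Hypothetical per-class F1 scores of the three classifiers on validation
# data, over the eight classes listed above (apartment ... other).
f1 = {
    "gsv":   np.array([0.81, 0.74, 0.66, 0.70, 0.77, 0.69, 0.72, 0.60]),
    "lidar": np.array([0.64, 0.71, 0.79, 0.58, 0.55, 0.62, 0.57, 0.52]),
    "ortho": np.array([0.70, 0.69, 0.73, 0.65, 0.61, 0.66, 0.68, 0.58]),
}

def fuse(preds: dict) -> int:
    # preds maps source -> predicted class index for one building. Keep the
    # prediction of the source whose validation F1 for that class is highest.
    best_source = max(preds, key=lambda s: f1[s][preds[s]])
    return preds[best_source]

# Sources disagree (GSV says class 4, the others say class 2); F1 decides.
print(fuse({"gsv": 4, "lidar": 2, "ortho": 2}))
```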

21 pages, 6623 KiB  
Article
Channel Morphology Change after Restoration: Drone Laser Scanning versus Traditional Surveying Techniques
by Jonathan P. Resop, Coral Hendrix, Theresa Wynn-Thompson and W. Cully Hession
Hydrology 2024, 11(4), 54; https://doi.org/10.3390/hydrology11040054 - 10 Apr 2024
Cited by 2 | Viewed by 3789
Abstract
Accurate and precise measures of channel morphology are important when monitoring a stream post-restoration to determine changes in stability, water quality, and aquatic habitat availability. Practitioners often rely on traditional surveying methods such as a total station for measuring channel metrics (e.g., cross-sectional area, width, depth, and slope). However, these methods have limitations in terms of coarse sampling densities and time-intensive field efforts. Drone-based LiDAR, or drone laser scanning (DLS), provides much higher resolution point clouds and has the potential to improve post-restoration monitoring efforts. For this study, a 1.3-km reach of Stroubles Creek (Blacksburg, VA, USA), which underwent a restoration in 2010, was surveyed twice with a total station (2010 and 2021) and twice with DLS (2017 and 2021). The initial restoration was divided into three treatment reaches: T1 (livestock exclusion), T2 (livestock exclusion and bank treatment), and T3 (livestock exclusion, bank treatment, and inset floodplain). Cross-sectional channel morphology metrics were extracted from the 2021 DLS scan and compared to metrics calculated from the 2021 total station survey. DLS produced 6.5 times the number of cross sections over the study reach and 8.8 times the number of points per cross section compared to the total station. There was good agreement between the metrics derived from both surveying methods, such as channel width (R2 = 0.672) and cross-sectional area (R2 = 0.597). As a proof of concept to demonstrate the advantage of DLS over traditional surveying, 0.1 m digital terrain models (DTMs) were generated from the DLS data. Based on the drone LiDAR data, from 2017 to 2021, treatment reach T3 showed the most stability, in terms of the least change and variability in cross-sectional metrics as well as the least erosion area and volume per length of reach.
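For context, a cross-sectional area metric of the kind compared above can be computed from surveyed (station, elevation) pairs by trapezoidal integration of depth below a reference elevation; the sketch uses made-up survey points and an assumed reference level, not data from the study.

```python
import numpy as np

def cross_section_area(station: np.ndarray, elevation: np.ndarray,
                       reference_elev: float) -> float:
    # Depth below an assumed reference (e.g., bankfull) elevation, clipped at
    # zero above it, then integrated across the section with the trapezoid rule.
    depth = np.clip(reference_elev - elevation, 0.0, None)
    return float(np.sum(0.5 * (depth[1:] + depth[:-1]) * np.diff(station)))

station = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])    # m across the channel
elevation = np.array([2.0, 1.2, 0.6, 0.5, 1.1, 2.0])  # m, channel bed
print(cross_section_area(station, elevation, reference_elev=1.8))  # m^2
```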

21 pages, 14599 KiB  
Article
Transport Infrastructure Management Based on LiDAR Synthetic Data: A Deep Learning Approach with a ROADSENSE Simulator
by Lino Comesaña-Cebral, Joaquín Martínez-Sánchez, Antón Nuñez Seoane and Pedro Arias
Infrastructures 2024, 9(3), 58; https://doi.org/10.3390/infrastructures9030058 - 13 Mar 2024
Cited by 1 | Viewed by 2743
Abstract
In the realm of transportation system management, various remote sensing techniques have proven instrumental in enhancing safety, mobility, and overall resilience. Among these techniques, Light Detection and Ranging (LiDAR) has emerged as a prevalent method for object detection, facilitating the comprehensive monitoring of environmental and infrastructure assets in transportation environments. Currently, Artificial Intelligence (AI)-based methods, particularly Deep Learning (DL) models for the semantic segmentation of 3D LiDAR point clouds, are a powerful means of supporting the management of both infrastructure and vegetation in road environments. In this context, there is a lack of open labeled datasets suitable for training Deep Neural Networks (DNNs) in transportation scenarios. To fill this gap, we introduce ROADSENSE (Road and Scenic Environment Simulation), an open-access 3D scene simulator that generates synthetic datasets with labeled point clouds. We assess its functionality by adapting and training a state-of-the-art DL-based semantic classifier, PointNet++, with synthetic data generated by both ROADSENSE and the well-known HELIOS++ (Heidelberg LiDAR Operations Simulator). To evaluate the resulting trained models, we apply both DNNs to real point clouds and demonstrate their effectiveness in both roadway and forest environments. While the differences are minor, the best mean intersection over union (MIoU) values for highway and national roads are over 77%, obtained with the DNN trained on HELIOS++ point clouds, and the best classification performance in forested areas is over 92%, obtained with the model trained on ROADSENSE point clouds. This work contributes a valuable tool for advancing DL applications in transportation scenarios, offering insights and solutions for improved road and roadside management.

35 pages, 4624 KiB  
Review
Emerging Trends in Autonomous Vehicle Perception: Multimodal Fusion for 3D Object Detection
by Simegnew Yihunie Alaba, Ali C. Gurbuz and John E. Ball
World Electr. Veh. J. 2024, 15(1), 20; https://doi.org/10.3390/wevj15010020 - 7 Jan 2024
Cited by 31 | Viewed by 15146
Abstract
The pursuit of autonomous driving relies on developing perception systems capable of making accurate, robust, and rapid decisions to interpret the driving environment effectively. At the core of these systems, object detection is crucial for understanding the environment. While 2D object detection and classification have advanced significantly with the advent of deep learning (DL) in computer vision (CV) applications, they fall short in providing essential depth information, a key element in comprehending driving environments. Consequently, 3D object detection becomes a cornerstone for autonomous driving and robotics, offering precise estimations of object locations and enhancing environmental comprehension. The CV community’s growing interest in 3D object detection is fueled by the evolution of DL models, including Convolutional Neural Networks (CNNs) and Transformer networks. Despite these advancements, challenges such as varying object scales, limited 3D sensor data, and occlusions persist in 3D object detection. To address these challenges, researchers are exploring multimodal techniques that combine information from multiple sensors, such as cameras, radar, and LiDAR, to enhance the performance of perception systems. This survey provides an exhaustive review of multimodal fusion-based 3D object detection methods, focusing on CNN- and Transformer-based models. It underscores the necessity of equipping fully autonomous vehicles with diverse sensors to ensure robust and reliable operation. The survey explores the advantages and drawbacks of cameras, LiDAR, and radar sensors. Additionally, it summarizes autonomy datasets and examines the latest advancements in multimodal fusion-based methods. The survey concludes by highlighting the ongoing challenges, open issues, and potential directions for future research.

30 pages, 40008 KiB  
Article
Contribution of Geometric Feature Analysis for Deep Learning Classification Algorithms of Urban LiDAR Data
by Fayez Tarsha Kurdi, Wijdan Amakhchan, Zahra Gharineiat, Hakim Boulaassal and Omar El Kharki
Sensors 2023, 23(17), 7360; https://doi.org/10.3390/s23177360 - 23 Aug 2023
Cited by 9 | Viewed by 2674
Abstract
The use of a Machine Learning (ML) classification algorithm to classify airborne urban Light Detection And Ranging (LiDAR) point clouds into main classes such as buildings, terrain, and vegetation has been widely accepted. This paper assesses two strategies to enhance the effectiveness of the Deep Learning (DL) classification algorithm. Two ML classification approaches are developed and compared in this context. These approaches utilize the DL Pipeline Network (DLPN), which is tailored to minimize classification errors and maximize accuracy. The geometric features calculated from a point and its neighborhood are analyzed to select the features that will be used in the input layer of the classification algorithm. To evaluate the contribution of the proposed approach, five point-cloud datasets with different urban typologies and ground topography are employed. These point clouds exhibit variations in point density, accuracy, and the type of aircraft used (drone and plane). This diversity in the tested point clouds enables the assessment of the algorithm’s efficiency. The high classification accuracy obtained, between 89% and 98%, confirms the efficacy of the developed algorithm. Finally, the results of the adopted algorithm are compared with both rule-based and ML algorithms, providing insights into the positioning of DL classification algorithms among other strategies suggested in the literature.
(This article belongs to the Special Issue State-of-the-Art Multimodal Remote Sensing Technologies)
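The paper's exact feature set is not listed here, but covariance-eigenvalue descriptors are a common choice for "geometric features calculated from a point and its neighborhood"; the sketch below computes linearity, planarity, and sphericity over k-nearest-neighbor neighborhoods as one plausible instance, on a random placeholder cloud.

```python
import numpy as np
from scipy.spatial import cKDTree

def geometric_features(points: np.ndarray, k: int = 20) -> np.ndarray:
    # For each point, take its k nearest neighbors, form the 3x3 covariance
    # matrix, and derive eigenvalue ratios: linearity, planarity, sphericity.
    tree = cKDTree(points)
    _, idx = tree.query(points, k=k)
    feats = np.empty((len(points), 3))
    for i, nbrs in enumerate(idx):
        cov = np.cov(points[nbrs].T)
        ev = np.sort(np.linalg.eigvalsh(cov))[::-1]   # l1 >= l2 >= l3
        l1, l2, l3 = np.maximum(ev, 1e-12)
        feats[i] = [(l1 - l2) / l1,    # linearity (edges, wires)
                    (l2 - l3) / l1,    # planarity (roofs, terrain)
                    l3 / l1]           # sphericity (vegetation)
    return feats

cloud = np.random.rand(1000, 3)
print(geometric_features(cloud)[:3])
```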

33 pages, 4823 KiB  
Article
NR5G-SAM: A SLAM Framework for Field Robot Applications Based on 5G New Radio
by Panagiotis T. Karfakis, Micael S. Couceiro and David Portugal
Sensors 2023, 23(11), 5354; https://doi.org/10.3390/s23115354 - 5 Jun 2023
Cited by 11 | Viewed by 5357
Abstract
Robot localization is a crucial task in robotic systems and is a prerequisite for navigation. In outdoor environments, Global Navigation Satellite Systems (GNSS) have aided towards this direction, alongside laser and visual sensing. Despite their application in the field, GNSS suffer from limited availability in dense urban and rural environments. Light Detection and Ranging (LiDAR), inertial, and visual methods are also prone to drift and can be susceptible to outliers due to environmental changes and illumination conditions. In this work, we propose a cellular Simultaneous Localization and Mapping (SLAM) framework based on 5G New Radio (NR) signals and inertial measurements for mobile robot localization with several gNodeB stations. The method outputs the pose of the robot along with a radio signal map based on Received Signal Strength Indicator (RSSI) measurements for correction purposes. We then perform benchmarking against LiDAR-Inertial Odometry Smoothing and Mapping (LIO-SAM), a state-of-the-art LiDAR SLAM method, comparing performance via a simulator ground truth reference. Two experimental setups are presented and discussed using the sub-6 GHz and mmWave frequency bands for communication, while the transmission is based on down-link (DL) signals. Our results show that 5G positioning can be utilized for radio SLAM, providing increased robustness in outdoor environments and demonstrating its potential to assist in robot localization as an additional absolute source of information when LiDAR methods fail and GNSS data are unreliable.
(This article belongs to the Special Issue Sensor Based Perception for Field Robotics)
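Radio SLAM systems of this kind typically need a measurement model linking RSSI to range; the sketch below shows the standard log-distance path-loss inversion as one generic building block, with assumed calibration constants (it is not the NR5G-SAM formulation).

```python
def rssi_to_distance(rssi_dbm: float, p0_dbm: float = -40.0, n: float = 2.7) -> float:
    # Log-distance path-loss inversion: d = d0 * 10**((P0 - RSSI) / (10 * n)),
    # with a d0 = 1 m reference distance. P0 (RSSI at d0) and the path-loss
    # exponent n are assumed calibration constants, not values from the paper.
    return 10 ** ((p0_dbm - rssi_dbm) / (10.0 * n))

# Rough range to one gNodeB from a -70 dBm reading, under the assumptions above.
print(rssi_to_distance(-70.0))   # ~ metres
```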
